The numerical approximation of Maxwell’s equations, computational electromagnetics
(CEM), has emerged as a crucial enabling technology for radio-frequency,
microwave, and wireless engineering. The three most popular “full-wave” meth- 
ods — the Finite Difference Time Domain method, the Method of Moments, and 
the Finite Element Method — are introduced in this book by way of one- or two- 
dimensional problems. Commercial or public domain codes implementing these 
methods are then applied to complex, real-world engineering problems, and a care- 
ful analysis of the reliability of the results obtained is performed, along with a dis- 
cussion of the many pitfalls that can result in inaccurate and misleading solutions. 
The book will empower readers to become discerning users of CEM software, 
with an understanding of the underlying methods, and confidence in the results 
obtained. It also introduces readers to the art of code development. This book has 
a dedicated website making available a number of MATLAB scripts, implementing 
much of the theory discussed, and including additional material on the practical 
applications of CEM. Suitable for senior undergraduate and graduate students tak- 
ing courses on CEM, this would also be a valuable reference book for practicing 
engineers in the industry. 
Preface


On graduating twenty years back, in 1984, my first job was as a research engi- 
neer working on computational electromagnetics (CEM) at the National Institute 
for Aeronautical Systems Technology (as it was then called) of the Council for 
Scientific and Industrial Research (CSIR) in Pretoria, South Africa. It was an ex- 
citing time to be working in this field. Although a number of methods had already 
been successfully introduced, including the three which will be discussed in detail 
in this book, major advances were being made in all of these methods, and the 
power of desktop computers was growing in leaps and bounds. No commercial 
programs (or codes, as they are generally called) were then available for RF prob- 
lems, but some US government-sponsored codes, in particular the NEC-2 code, 
were becoming available for general use. 

The 1980s saw the final decade of the Cold War, which in some areas (such 
as Southern Africa) was far from cold. New military technologies, in particular 
stealth, were driving CEM to address progressively more electromagnetically com- 
plex problems. However, when the Cold War ended, far from CEM work coming to 
a halt, new commercial markets, such as the rapidly developing market in mobile 
telephony and personal communication systems, and the proliferation of electronic 
systems in motor vehicles, continued to drive the technology forward at breakneck 
speed throughout the 1990s. This was also due to the widespread availability of 
cheap and progressively more powerful personal computers as a crucial enabling 
technology. 

CEM has now reached a modicum of maturity, with a number of powerful meth- 
ods available, able to solve problems of real engineering interest at radio frequen- 
cies, and with a number of commercial codes available. This has brought a signif- 
icant change in the profile of CEM practitioners, which has not been fully appre- 
ciated in the community at the time of writing. In addition to the traditional group 
of CEM users — largely academics, post-graduate students and research engineers 
at large corporations or research establishments — an entirely new generation of 
users has arisen. Their interest is typically in using an existing commercial package
to solve a particular problem as rapidly as possible. They may well not have any 
post-graduate exposure to CEM methods, and questions which may appear ele- 
mentary to CEM researchers (such as which technique is most appropriate for the 
problem at hand) are actually far from obvious to the beginner in the field; further- 
more, marketing can “hype” a particular implementation/technique to the point 
where it appears omnipotent. Commercial codes aside, even academic papers are 
not free of such bias. 

This book aims to serve the interests of both “traditional” CEM users, primarily
academics, researchers and research students, and also this new non-specialist
user community in industry. The book aims to fill the gap between traditional un- 
dergraduate textbooks, which generally have at most a very cursory discussion of 
numerical methods; antenna texts, which concentrate only on the analysis of an- 
tennas using the methods; and the specialist books on each method which are fre- 
quently formidable reading for students, or unnecessarily detailed for engineers 
whose primary interest is in using the powerful CEM codes now available. In 
this book, the computational methods will generally be introduced using simple 
one-dimensional or two-dimensional examples, so that the core of the method can 
be appreciated without being overwhelmed by the problems of handling complex 
three-dimensional geometries. Following this, the extensions required to deal with 
the real three-dimensional world of RF engineering are outlined, so that one gains 
an appreciation for the operation of complex codes. Such is the complexity of 
general-purpose three-dimensional CEM codes that realistic applications cannot 
be undertaken with anything a post-graduate student can realistically be expected 
to develop during a typical course, and product cycles are too short in industry to 
make the development of general-purpose three-dimensional codes feasible, given 
that off-the-shelf codes are now available. 

Research students will find some features not often described in other books in 
this field, such as how to go about debugging and verifying a CEM code. Industrial 
users should find the discussions of the strengths and weaknesses of each method, 
as well as frequent modelling hints, comprehensive discussions of typical mod- 
elling errors, and the necessity of careful evaluation and verification of results, of 
great interest and utility. In short, the book discusses not only the science of CEM 
modelling, which can be gleaned from (much) reading, but also the art of devel- 
oping and verifying reliable codes and computing reliable data, which is a skill 
generally derived from (sometimes bitter!) experience. 

This book concentrates on the “big three” techniques in CEM — the Finite Dif- 
ference Time Domain (FDTD) method, the Method of Moments (MoM) and the 
Finite Element Method (FEM). It was decided to focus on these three methods, 
since they are the most widely used in the field and all have been implemented 
in successful commercial codes; some other methods are very briefly discussed so
that readers are at least aware of them, but this book makes no pretense of address- 
ing these other methods in any detail. Furthermore, the discussion in this book is 
focussed exclusively on applications in RF engineering. Methods such as the FEM 
have been used with great success for magnetostatic problems, such as motor de- 
sign, but this will not be discussed here at all. A feature not often found in other 
books at this level is a discussion of stratified media, using the Sommerfeld po- 
tentials. Although a theoretically advanced topic, the widespread use of integrated 
antennas, especially microstrip, has made an appreciation of at least the basics of 
this approach very important. Finally, the book does not pretend to be a compre- 
hensive text on electromagnetic theory, high-frequency circuit theory, or antenna 
theory and design. There are a number of superb books addressing these topics and 
this book is designed to complement, not compete with, them. Frequent references
are made to suitable books. 

Readers will also note that the level of the material becomes increasingly so- 
phisticated as the book progresses. This is by design. The FDTD method is the 
only method where one can realistically hope to develop useful code oneself in a 
reasonable timeframe, so the discussion of this method is rather more “nuts and 
bolts” than for the MoM or FEM. CEM methods can also be approached as es- 
sentially an exercise in applied mathematics; although interesting theoretical in- 
sights can be thus gained, it is the author’s experience that engineers do not readily 
take to this approach, certainly not for their initial introduction to the methods, so 
the introductory discussions of at least the FDTD method and MoM draw mainly 
on engineering physics, rather than applied mathematics. Some of the more the- 
oretical approaches to CEM are introduced towards the end of the book, in the 
chapters on the MoM and FEM. (Perhaps because of the enormous amount of 
work on the FEM in applied mechanics, this is probably the method with the most 
well-developed mathematical background.) These include some elementary con- 
cepts from functional analysis, with the associated concepts of inner products and 
weighted residuals, as well as a brief mention of differential forms. A difficult de- 
cision was how much of the great volume of recent advances to reflect in the book. 
Topics such as the fast multipole method have revitalized the MoM in particular, 
and cannot be ignored, but the treatment of this and some other “research frontier” 
material is of necessity cursory. 

A highly problematic issue was the selection of which commercial CEM codes 
to use to illustrate complex real-world implementations. One factor influencing 
this was the availability of a no-cost limited feature version of the software, as in 
the case of the MoM code FEKO; however, the FDTD and FEM codes discussed 
are unfortunately not available in such a format. The discussion tries to high- 
light generic features which a code should offer, and how users can exploit these. 
User-manual style descriptions of how to use particular codes have been avoided
as far as possible, so that discussions of one particular code should extend to other 
commercial codes implementing the same method, at least to a degree. At the time 
of writing, FEKO supported a type of scripting language, which has been used in 
places to automate the generation of complex geometries for MoM analysis; the 
constructs (FOR loops, IF-THEN-ELSE conditionals) are felt to be sufficiently
generic to be useful in other codes supporting similar features. 

Where appropriate, references are provided for further reading. In general, only 
those readily available in the English language archival open literature have been 
listed. On one or two occasions, internal reports have been included. The engi- 
neering community is divided on the use of such references; authors in the USA in 
particular often reference such reports in journal papers, which often prove frus- 
tratingly difficult to locate, sometimes being limited to US distribution only. In 
consequence, this has only been done when there is no other published version 
of the material. A similar problem can be encountered with theses; here, however,
some significant recent research has necessitated limited reference to recent dis- 
sertations, since these results are yet to appear in the archival literature. 

The book draws primarily on the literature of Western science. Much work was 
done on computational electromagnetics, especially in the former Soviet Union, but
unfortunately little has been translated, and what has been is very difficult reading 
for electronic engineers trained in the Western tradition; it also tends to be at a 
much higher theoretical level than the main thrust of this book. 

This book is an outgrowth of notes developed over a fifteen-year period for a
post-graduate course taught by the author at the University of Stellenbosch, South 
Africa, as well as a short course for industry taught by the author and several 
colleagues in 1999. Extensive integration of the material was undertaken during the 
author’s sabbatical visit as a Guest Professor at the Delft University of Technology 
during 2003, where the course was also taught. Chapter 2 is adapted and extended 
from notes originally prepared by James T. Aberle at Arizona State University, 
Tempe, AZ, USA, and he is credited accordingly, but the rest of the authorship is 
that of DBD. 


Acknowledgements 


Stimulating careers are frequently the result of interactions with interesting people, 
and I would like to acknowledge a number of exceptional engineering scientists 
who have either mentored me, worked with me, or studied under me. My late 
father, an electronic engineer, spent much of his career working in the microwave 
and telecommunications industry in the UK and South Africa and sparked my early 
interest in electronic engineering; he started his career during the Second World 
War, working on some of the first radar sets deployed in South Africa (and later 
North Africa and Italy). John Cloete, Wynand Louw, Derek McNamara, and Jan 
Malherbe gave inspiring undergraduate and post-graduate courses at the University 
of Pretoria from 1981-1983, which originally fired my interest in this specific field. 
John and Derek continued as research supervisors for my M.Sc. and Ph.D. research 
on the MoM from 1986-1991. Dirk Baker gave me my first job at the CSIR in 
1984; he is an outstanding antenna engineer and his scepticism of computed results 
was an invaluable baptism of fire. John Cloete offered me the opportunity to join 
the University of Stellenbosch in 1988 and we have continued to interact most 
fruitfully throughout my career. 

Rick Ziolkowski taught me the power of the FDTD method during my post- 
doctoral stay at the University of Arizona in 1993. (Rick made significant contri- 
butions to the method and its applications, especially in complex material mod- 
elling.) Ron Ferrari and Ricky Metaxas kindly hosted me at Cambridge University 
during a sabbatical visit in 1997, where I had the opportunity to enrich greatly 
my knowledge of the FEM during frequent discussions with them and their stu- 
dents. Jim Aberle (Arizona State University) brought novel ideas to the teaching 
of the FDTD as well as spectral domain MoM methods, during a short course we 
taught in 1999; his ideas are reflected in places in this book. Leo Ligthart and Alex 
Yarovoy hosted me during my 2003 sabbatical at Delft University of Technology, 
during which time I initiated the actual writing of this book; their enthusiasm was 
very supportive. 


Of my research students: in particular, the work of a number of my doctoral 
students is reflected in places in this volume, especially Frans Meyer — who went 
on to co-found Electromagnetic Software and Systems (Pty) Ltd., turning research 
ideas in CEM into commercially successful products — Marianne Bingle, Matthys 
Botha, Pierre Steyn, and Riana Geschke, and I would like to acknowledge their 
dedication to research excellence here. Frans and Matthys’s work in particular is 
described in some detail in the final chapter. I would also like to thank Matthys 
for his proofreading and detailed comments on, and suggestions for, the final two 
chapters, which were most useful. Very useful interactions with a number of engi- 
neers (some of them previously my graduate students) at Electromagnetic Software 
and Systems are also reflected in this book, including Ulrich Jakobus (the original 
author of FEKO), Johann van Tonder, Isak Theron, Gronum Smith, Danie le Roux, 
and Sam Clarke. Many years of continuing discussions on electromagnetics with 
my colleagues at the University of Stellenbosch, in particular John Cloete, Petrie 
Meyer, Howard Reader, and Keith Palmer have also influenced the development 
of this book, as have those with colleagues in electronic engineering in general, in 
particular Dave Weber and the late David Frost. 

I would also like to thank the (South African) National Research Foundation 
and its predecessor, the Foundation for Research Development, for many years 
of research funding, in particular grant-holder bursaries, equipment funding and 
sabbatical support. 

Electromagnetic Software and Systems and Computer Simulation Technology 
kindly provided evaluation copies of FEKO and CST MICROWAVE STUDIO™
respectively. The former also provided the image on which the cover art was based. 
My thanks to Vanessa Weber for the graphic design she produced from this for the 
cover. 

The love and forbearance of my wife Amor, who was bearing our first child
Bruce during much of the period when this book was in preparation, was essential. 

Finally, the support of the Cambridge University Press team is much appreci- 
ated. 


To the reader 


This book is designed to serve as an introduction to computational electromagnet- 
ics for radio-frequency applications. It assumes the reader has completed typical 
undergraduate courses in electromagnetic field theory, and has some basic knowl- 
edge of antenna design and microwave systems. 

For readers in a hurry, who already know which of the techniques discussed they 
would like to learn more about, it is possible to go directly to the relevant chapters, 
but it would nonetheless be useful first to read the introductory chapter. For those 
in a hurry, but who need first to find out which method (or methods) to use, this 
chapter is essential reading. 

For readers who intend working through most of the book, it would be best 
to work through it in the sequence presented, although the chapters on the Som- 
merfeld formulation and practical applications thereof could be omitted without 
interrupting the sequence of presentation. A more detailed outline of the book may 
be found in Section 1.11; this will also assist readers to locate rapidly the parts of 
the book of interest to them. 

At the end of each chapter, a list of references linked to the chapter topic is 
presented, for further reading and study. 
Notation


Throughout this book, the following notation is used. Spatial vectors are indicated
as $\mathbf{E}$ (in this case, the electric field). Vectors in the linear algebra sense are
indicated as $\{x\}$, and matrices as $[A]$. The individual elements of a vector or matrix are
of course indicated as $x_i$ or $A_{ij}$ respectively. Otherwise, the notation is as generally
encountered in engineering books on this topic. A summary is presented below.

The time convention used for phasor quantities is $e^{j\omega t}$; hence, an $e^{-jkr}$ plane
wave propagates in the direction of increasing $r$. (Note that physics books often
adopt the $e^{-i\omega t}$ convention, in which case the sign also changes in the plane wave
exponential factor.)
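
The statement about the direction of propagation can be checked by following a point of constant phase. The short derivation below is a sketch that assumes a real, positive wavenumber $k$ and takes the physical field as the real part of the phasor product; neither assumption is spelled out in the notation summary itself.

```latex
\begin{align*}
  \mathrm{Re}\left\{ e^{-jkr}\, e^{j\omega t} \right\} &= \cos(\omega t - k r), \\
  \omega t - k r = \text{const}
    \quad &\Longrightarrow \quad
    \frac{dr}{dt} = \frac{\omega}{k} > 0 .
\end{align*}
```

A point of constant phase therefore moves towards larger $r$. Under the $e^{-i\omega t}$ convention the same reasoning applied to $e^{+ikr}$ gives the same direction of travel, which is why the sign in the exponential factor must change with the convention.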


$\nabla \times$     the curl operation
$\nabla \cdot$      the divergence operation
$\times$            the vector cross product of two vectors
$\mathbf{E}$        the (field) vector E
$\epsilon_0$        the permittivity of free space ($\approx 8.854 \times 10^{-12}$ F/m)
$\epsilon_r$        relative permittivity of a dielectric material (dimensionless)
$\mu_0$             the permeability of free space ($4\pi \times 10^{-7}$ H/m)
$\mu_r$             relative permeability of a magnetic material (dimensionless)
$c$                 the speed of light in free space ($\approx 2.9979 \times 10^{8}$ m/s)
$\lambda$           wavelength [m] or real part of spectral variable $k_\rho$
                    (the meaning will be clear from the context)
$\lambda_i$         simplex coordinate $i$
$O(M^n)$            of the order of $M^n$; formally,
                    $N = O(M^n) \Rightarrow \lim_{M \to \infty} \log N / \log M = n$
$[A]$               the matrix A
$A_{ij}$            the $ij$th element of matrix $[A]$
$\{x\}$             the (algebraic) vector x
$x_i$               the $i$th element of vector $\{x\}$
$\|\{x\}\|$         the Euclidean norm of the vector $\{x\}$ of length $n$,
                    $\|\{x\}\| = \sqrt{\sum_{i=1}^{n} |x_i|^2}$
$\equiv$            is defined as
$\forall$           for all
$|z|$               absolute value of $z$
$\Rightarrow$       implies




1 An overview of computational electromagnetics for RF and microwave applications


Even if we do discover a complete unified theory, it would not mean that we would be able 
to predict events in general ... even if we do find a complete set of basic laws, there will still 
be in the years ahead the intellectually challenging task of developing better approximation 
methods, so that we can make useful predictions of the probable outcomes in complicated 
and realistic situations. 

From [1, pp. 168-169] (the present author’s emphasis).


Computations: no-one believes them, except the person who made them. 
Measurements: everyone believes them, except the person who made them ... 
Attributed to Professor B. Munk, Ohio State University. 


1.1 Introduction 


Electromagnetics, the study of electrical and magnetic fields and their interaction, 
has been one of the core technologies of the twentieth century, and shows every 
sign of continuing this into the twenty-first. Whilst there are many useful ways 
of subdividing the field, power frequency versus radio frequency, or alternatively 
quasi-static versus full-wave, is one of the most insightful here. This book focusses 
exclusively on radio-frequency, full-wave electromagnetic modelling, as typically 
encountered in communication systems. 

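The core of modern electromagnetic engineering is of course Maxwell’s equations.
Written in modern form,¹ they are:

$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{1.1}$$
$$\nabla \times \mathbf{H} = \frac{\partial \mathbf{D}}{\partial t} + \mathbf{J} \tag{1.2}$$
$$\nabla \cdot \mathbf{D} = \rho \tag{1.3}$$
$$\nabla \cdot \mathbf{B} = 0 \tag{1.4}$$

with the associated constitutive equations

$$\mathbf{B} = \mu \mathbf{H} \tag{1.5}$$
$$\mathbf{D} = \epsilon \mathbf{E} \tag{1.6}$$

1 Maxwell did not actually write his equations in this form; vector analysis was a late nineteenth-century
development.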


The actual solution of the Maxwell equations is complex, and for realistic prob- 
lems, approximations are usually required — as indicated by the introductory quote 
from Hawking, although he had in mind an altogether more ambitious theory (of 
everything!). The numerical approximation of Maxwell’s equations, the subject of 
this book, is known as computational electromagnetics (CEM). 

CEM techniques have been available for close on four decades now. These 
techniques have gestated, grown and matured to the point where they form an 
invaluable part of current RF and microwave engineering practice [2]. However, 
the widespread adoption of computational methods to complement the traditional 
tools of analysis and measurement has attracted criticism, summarized with more 
than a grain of truth by the second quote at the beginning of the chapter. Ironi- 
cally, the availability of powerful, commercial codes may well have made the sit- 
uation worse, not better, since more and more frequently, codes are being applied 
by users unfamiliar with the basic formulations underlying the codes, and not in- 
frequently to problems for which the codes were not designed. One of the major 
aims of this book is to make RF computational electromagnetics comprehensible 
and accessible to a far wider group of RF engineers than has been the case in the 
past. 

CEM is a multi-disciplinary field. Its core disciplines are electromagnetic theory 
and numerical methods, but for useful implementations, geometric modelling and 
visualization, computer science and algorithms all have important roles to play. In 
this book, the focus falls on the core disciplines. 

The applications of CEM are legion, and include antennas, biological EM effects,
medical diagnosis and treatment, electronic packaging and high-speed circuitry,
superconductivity, microwave devices, monolithic microwave integrated
circuits, law enforcement, environmental issues, materials, avionics, communica- 
tions, energy generation and conservation, low observable vehicles (stealth), radars 
and imaging, surveillance and intelligence gathering. In this book, we focus pri- 
marily on applications in antennas, wireless communications, radar, and (passive) 
microwave devices, although an example will be given of a biological EM effect 
study. 
An historical aside: a brief history of electromagnetics


Interest in static electricity and magnetism, of course, dates back to ancient 
times. The Ancient Greeks circa 400 BC noted that rubbing amber attracted 
bits of straw, and the Chinese reportedly found lodestones (natural magnets) 
circa 2600 BC, first using them for burial purposes, and later for navigation. The 
modern study of electromagnetic phenomena dates to the late eighteenth century, 
with the great progress in experimental methods by Alessandro Volta (1745- 
1827), Hans Christian Oersted (1777-1851) and Michael Faraday (1791-1867) 
on the one hand, and the more mathematical modelling approach of Charles 
Augustin de Coulomb (1736-1806) and André-Marie Ampère (1775-1836) on
the other. Amongst these, the following milestones stand out: the development
of the battery by Volta provided a continuous source of electricity for the first
time; Coulomb’s careful measurements of the electric force resulted in the famous
inverse square law; Oersted’s 1820 discovery showed that (direct) current
deflected a magnet; Ampère developed mathematical laws describing this and
the force between current carrying wires; and finally, Faraday’s crucial contribu- 
tion in 1831 showed that a changing magnetic field sets up an alternating current 
(i.e. an electric field), and for the first time connected two forces of nature which 
until then had been thought quite independent. 

James Clerk Maxwell (1831-1879), the most brilliant physicist of the nineteenth
century,ᵃ combined the work of his predecessors in elegant theoretical
fashion and postulated that changing electric fields should generate magnetic
fields; he then showed that this implied wave motion. Hermann Ludwig
Ferdinand von Helmholtz (1821-1894) was one of the first to recognize the
significance of Maxwell’s predictions in this regard; in 1888, his student Heinrich
Rudolf Hertz (1857-1894) showed experimentally that electromagnetic fields
indeed propagate, and at the speed of light. Oliver Heaviside (1850-1925) also
made contributions in this regard, although his work is not widely recognized
nowadays [3]. In what we would now describe as the first commercial spin-off
of this work, Guglielmo Marconi (1874-1937) was the first to profit financially
from the emerging field of wireless.

Electromagnetics was also to have a profound influence on the outstanding
physicist of the twentieth century, Albert Einstein (1879-1955). Perhaps less
well known than some of his results (certainly amongst the general public),
Einstein showed that the magnetic field is the relativistic correction of the electric
field, confirming the unified field theoretic nature of Maxwell’s electromagnetic
theory.

a Maxwell not only unified electricity and magnetism in 1864, he also developed the kinetic theory of gases,
before his life was cut tragically short by illness.

The above is the conventional Western history of electromagnetics. Contribu- 
tions to the theory of light, intimately connected to electromagnetics, were made 
by many over an extremely long period of time, including contributions from 
Arabic scholars. An exceptionally erudite historical perspective may be found 
in [4]. 


1.2 Full-wave CEM techniques 


Full-wave CEM methods approximate the Maxwell equations numerically, without
any initial physical approximations being made. These are also sometimes
called low-frequency methods, to distinguish them from asymptotic high-frequency
methods, but this can be confusing for several reasons.² The full-wave
techniques which will be studied in this book are the finite difference time domain 
(FDTD) method; the method of moments (MoM), and the finite element method 
(FEM). Whilst there are other methods available, these are the most widely used, 
and all have been implemented in powerful computer codes. These techniques are 
frequently classified further by whether they are based on integral or differential 
equations, and by whether they operate in the time or frequency domain. We will 
discuss this in the context of each method subsequently. 

Sometimes, the expressions “static” or “quasi-static” will be used. The former 
applies obviously to the situation where one is dealing with either steady-state 
charges (and the associated electric fields) or currents (and the associated mag- 
netic fields). The latter applies to situations where the time rate of change is low 
enough that the fields still satisfy the static equations to a very good approxima- 
tion — or put differently, the 2B term in Eq. (1.1) is negligible (in which case one 
obtains electroquasistatics) or similarly for the 2D term in Eq. (1.2) (which yields 
magnetoquasistatics). A very detailed discussion of these approximations and their 
use may be found in [5]. However, we will not pursue this far in this book, which 
deals almost entirely with full-wave methods. 
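
The two quasi-static limits can be written out explicitly. The block below is a sketch using the field symbols of Eqs. (1.1) and (1.2); it assumes that Eq. (1.2) carries a conduction current density $\mathbf{J}$ as its source term, and the scalar potential $\phi$ is introduced here only for illustration.

```latex
\begin{align*}
  \text{electroquasistatics:} \quad
    & \frac{\partial \mathbf{B}}{\partial t} \approx 0
      \;\Longrightarrow\; \nabla \times \mathbf{E} \approx 0
      \;\Longrightarrow\; \mathbf{E} \approx -\nabla \phi , \\
  \text{magnetoquasistatics:} \quad
    & \frac{\partial \mathbf{D}}{\partial t} \approx 0
      \;\Longrightarrow\; \nabla \times \mathbf{H} \approx \mathbf{J} .
\end{align*}
```

In the first case the electric field is found as in electrostatics, even though the sources vary slowly in time; in the second, the magnetic field follows from the instantaneous current as in magnetostatics. In both cases the coupling that produces wave propagation has been dropped, which is what distinguishes these approximations from the full-wave methods treated in this book.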

There is another class of numerical method for solving the Maxwell equations,
generally called the asymptotic techniques. These methods require fundamental
approximations in the Maxwell equations, the validity of which increases
asymptotically with frequency. Examples are physical optics (PO), geometrical optics
(GO) and the uniform theory of diffraction (UTD). This is a field of study in
its own right. For suitable problems, these methods are very powerful, but the
underlying approximations of the physics limit their use for general problems.
Furthermore, unlike the full-wave methods, where Moore’s law and the resulting
increase in computer speed and memory continually extend the limits of
applicability, the asymptotic methods have fundamental limits. Hence, in this book, only
full-wave methods are considered. However, a hybridization with an asymptotic
technique will be discussed as an example of an advanced application.

2 Firstly, the high-frequency radio band is specifically the spectrum from 3-30 MHz; secondly, the meanings of
low and high are entirely relative, and the same methods may be, and are, useful from power frequencies up to
the visible spectrum and beyond; and finally, “high-frequency” as a general term in electronic engineering is
widely used to distinguish from “power frequency,” with the latter usually using quasi-static approaches.

The full-wave techniques are potentially very accurate. Central to all these
methods is the idea of discretizing some unknown electromagnetic property, typically
the surface current for the MoM, and the E field for the FEM and FDTD method.
(For the latter, the H field is also discretized.) This process of discretization is
also known as meshing. It entails subdividing the geometry into a (large) number
of small elements. These may be one-dimensional segments, two-dimensional
surface “patches” (often triangles), three-dimensional tetrahedral elements, or a
regular three-dimensional “staggered” grid, depending on the problem at hand and
the method used. Within each element, a simple functional dependence is assumed
for the spatial variation of the unknown (for instance, a linear approximation),
but the amplitude (and possibly phase) of the unknown is determined by
application of the method to the patchwork of elements which approximates the original
geometry. This functional dependence is also known as the basis (or expansion)
function.³
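
The idea of an assumed functional dependence with unknown amplitudes can be illustrated in one dimension. The following is a minimal sketch, not taken from the book's MATLAB scripts: the function names `hat` and `expand`, the three-node mesh and the amplitude values are all hypothetical, and piecewise-linear ("hat") functions are used simply because a linear approximation is the example mentioned above.

```python
def hat(x, nodes, i):
    """Piecewise-linear ("hat") basis function: 1 at nodes[i], 0 at every other node."""
    xi = nodes[i]
    if i > 0 and nodes[i - 1] <= x <= xi:
        return (x - nodes[i - 1]) / (xi - nodes[i - 1])    # rising edge
    if i < len(nodes) - 1 and xi <= x <= nodes[i + 1]:
        return (nodes[i + 1] - x) / (nodes[i + 1] - xi)    # falling edge
    return 0.0


def expand(x, nodes, amplitudes):
    """Evaluate the discretized unknown at x as sum_i amplitudes[i] * hat_i(x)."""
    return sum(a * hat(x, nodes, i) for i, a in enumerate(amplitudes))


if __name__ == "__main__":
    nodes = [0.0, 0.5, 1.0]                # hypothetical two-element 1D mesh
    amplitudes = [1.0, 2.0 + 1.0j, 4.0]    # complex values carry amplitude and phase
    for x in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(x, expand(x, nodes, amplitudes))
```

In a real solver the amplitudes are not given but are the unknowns that the method (MoM, FEM) solves for; the sketch only shows how a set of amplitudes and a fixed basis together define the approximate solution everywhere on the mesh.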

Generally, the accuracy of the methods is related to the discretization (i.e. mesh 
size). The finer is the mesh, the better is the accuracy of the methods.* The largest 
mesh size (alternatively, the finest geometrical resolution) is limited by the avail- 
able computational resources. In other fields such as structural mechanics, the 
mesh fineness is usually determined by the requirement to resolve the structural 
geometry adequately; in radio-frequency electromagnetics, the requirement on the 
mesh is usually to sample the phase adequately. For many years, the CEM com- 
munity has worked with a rule of thumb of ten segments per wavelength. This was 
originally derived for wire antenna problems, where the mesh is one-dimensional; 
for surfaces, this guideline becomes 100 segments per square wavelength (and 
a similar extension for volumetric meshes to 1000 per cubic wavelength). Much 
work on better elements has been done to reduce this requirement — it will readily 
be appreciated that as the dimensionality of the problem goes up, so this becomes 


3 With the FDTD method as usually introduced, the fields are sampled at points; it is however possible to define 
basis functions for the FDTD, a topic we discuss briefly in Chapter 10. 
4 This is not invariably true: limitations imposed by approximations in the formulations may place some lower
bound on element size. A classic example is a thin-wire MoM formulation, where using too many segments 
may violate the underlying thin-wire assumptions. This is discussed in detail in Section 4.3. 
progressively more crucial. It should also be noted that when very accurate field
data are required — for example, when computing antenna input impedance — a finer 
mesh may be required, at least locally around the feed point of the antenna. Fur- 
thermore, this guideline ignores the problem of dispersion in differential equation 
based solvers, which effectively requires denser meshes for electromagnetically 
larger problems. 
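
The rule of thumb above can be turned into a rough element count. The sketch below assumes a structure whose extent is the same in every meshed dimension and a uniform density of ten elements per wavelength; the function name and the example values are illustrative only, and the dispersion and local-refinement effects just mentioned are deliberately ignored.

```python
C0 = 299_792_458.0  # speed of light in vacuum, m/s


def elements_rule_of_thumb(size_m, freq_hz, dim, per_wavelength=10):
    """Estimate the element count for a structure of extent size_m meshed in dim dimensions."""
    wavelength = C0 / freq_hz
    electrical_size = size_m / wavelength  # extent in wavelengths
    return (per_wavelength * electrical_size) ** dim


# Hypothetical example: a 1 m structure at 3 GHz (about ten wavelengths across).
for dim, label in ((1, "wire"), (2, "surface"), (3, "volume")):
    print(label, round(elements_rule_of_thumb(1.0, 3e9, dim)))
```

For a structure about ten wavelengths across, the count grows from roughly a hundred segments to roughly ten thousand patches to roughly a million volume elements, which is why the dimensionality of the mesh matters so much.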

Although the full-wave methods share the basic idea of discretization, and in- 
deed have been viewed within a very general framework as simply different im- 
plementations of one overarching theoretical formulation, in practice, the methods 
have quite different challenges for theoreticians, code developers and users, as 
well as different optimal areas of application, and as such, they will be consid- 
ered separately in this overview chapter. In Chapter 10, some of the underlying 
mathematical connections between the methods will emerge. 

In the rest of this overview chapter, the MoM, FEM and FDTD method will 
be reviewed qualitatively, emphasizing basic principles such as the underlying 
formulation (integral/differential equation based, frequency or time domain) and 
areas of application (perfectly or highly conducting materials versus homogeneous 
or inhomogeneous penetrable structures; microwave devices versus radiation or 
scattering analysis). This review is especially designed for readers who have a par- 
ticular problem to solve, but are not sure which is the best method to use. Details of 
each method will be found in the subsequent chapters of the book. Key references 
only are given; a far more extensive list of references will be found at the end of 
each chapter. 

By way of introduction, some of the most important characteristics of the MoM, 
FEM and FDTD method are presented in Tables 1.1 and 1.2. Table 1.1 provides a 
comparison of the methods for open region (radiation and scattering) problems. It 
is important to note that what is presented in this table are the key characteristics of 
the method as widely implemented and understood in the CEM community. As will 
be seen in the description of each method in the following sections, a number of 


Table 1.1 Strengths and weaknesses of CEM methods as widely implemented for
open region problems

Formulation  Equation type  Domain     Radiation  PEC only  Homogeneous  Inhomogeneous
                                       condition            penetrable   penetrable
MoM          Integral       Frequency  Yes        ✓         ✓            ✗
FEM          Differential   Frequency  No         ✗         ✓            ✓
FDTD         Differential   Time       No         ✗         ✓            ✓

Key: ✓ good; ✗ not optimal.
Table 1.2 Strengths and weaknesses of CEM methods for guided wave problems

Formulation  Equation type  Domain     Wideband  PEC only  Homogeneous  Inhomogeneous
                                                           penetrable   penetrable
MoM          Integral       Frequency  ?         ?         ?            ?
FEM          Differential   Frequency  ?         ?         ?            ?
FDTD         Differential   Time       ?         ?         ?            ?

Key: good; satisfactory, but not necessarily the best; not optimal. (? marks a rating symbol that is illegible in this copy.)


simplifications have been made in this table: the MoM, for instance, can be seen in 
a more general sense as including the FEM, although this is not normal usage; and 
to give another example, the FEM can also operate in the time domain, but there 
are no commercial implementations of this at present. For the MoM, homoge- 
neous penetrable materials (dielectrics, for instance) can either be modelled using 
equivalent surface currents or, if the problem consists of layered materials, using a 
Sommerfeld formulation. This has not been noted in the table, since it depends on 
the details of the problem. Table 1.2 provides a similar comparison of the methods 
for guided wave problems.⁵ Again, the details of the precise implementation have
not been commented on. 


1.3 The method of moments (MoM) 


The MoM is probably the most widely used numerical technique in RF CEM, 
and has a long history in the field; some of this is presented in Chapter 4. For 
antenna engineering, the MoM has been the most widely used CEM method.⁶ In
the method of moments, the radiating/scattering structure is replaced by equiv- 
alent currents. These are normally surface currents. (Volumetric currents can be 
used for inhomogeneous dielectric bodies. This is however very expensive com- 
putationally.) This surface current is discretized into wire segments and/or surface 
patches. A matrix equation is then derived, representing the effect of every seg- 
ment/patch on every other segment/patch. This interaction is computed using the 
Green function for the problem. (Green functions will be discussed later in this 


5 It is tempting to use the term “closed problems” here, but a number of important guiding structures, such as
microstrip, are partially open. It is assumed in this table that FEM and FDTD codes have an appropriate method 
of terminating this region. Since the energy decays rapidly away from the guiding structure, and this radiation 
is a secondary effect in most applications, the open boundary is usually less problematic here than in the case 
of the radiation and scattering problems. 

6 The name “method of moments” is peculiar to the CEM community. Perhaps the most descriptive alternative
name is the “method of weighted residuals.” The term “boundary element method” is frequently used
synonymously with MoM, and for surface formulations this is correct, but there are some moment method 
formulations which use volume, not boundary, elements. We discuss this further in Chapter 4. 
book; indeed, an entire chapter, Chapter 7, is devoted to one such function.) Most
MoM codes use the free-space Green function. The relevant boundary condition is 
then applied to all the interactions, yielding a set of linear equations. The solution 
of this linear system yields the (approximate) current on each segment/patch. The 
resulting matrix which must be factored (or used in an iterative solution scheme) 
is fully populated, with complex valued entries. Typical matrix dimensions range 
from some hundreds for small antenna problems to several thousand — the upper 
limit is imposed by computational limitations, either limited memory and/or ex- 
cessive run-time. 
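
To see why memory caps the matrix dimension, the storage needed for a fully populated complex matrix can be estimated directly. The sketch below assumes double-precision complex entries of 16 bytes each, which is a common choice but an assumption here rather than something stated for any particular code; the function name is illustrative.

```python
def dense_matrix_bytes(n_unknowns, bytes_per_entry=16):
    """Storage for a fully populated n x n matrix of complex entries."""
    return n_unknowns ** 2 * bytes_per_entry


for n in (500, 5_000, 50_000):
    print(n, dense_matrix_bytes(n) / 1e6, "MB")
```

Because the storage grows with the square of the number of unknowns, a tenfold increase in unknowns costs a hundredfold increase in memory.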

Traditionally, the MoM has been applied in the frequency domain, i.e. single 
frequency, or monochromatic, sinusoidal excitation, with an $e^{j\omega t}$ convention assumed.
The working variables (unknowns) are thus complex valued, with a magnitude
and phase, as for any phasor analysis. Time domain integral equation (TDIE)
formulations have been used on occasions, but stability and other issues have 
proven difficult, and TDIE codes are rare. 

The use of the MoM for antenna analysis was given a major boost by the 
US government’s de facto decision during the late 1980s to release the Numeri- 
cal Electromagnetic Code — Method of Moments (widely known as NEC-2) into 
the public domain. NEC-2 is a powerful, general-purpose antenna modelling pro- 
gram, but with no graphical abilities whatsoever and very limited meshing abilities. 
NEC-2 is discussed in Chapter 5. A later version, NEC-4, added some specialized 
functionality. At present, there are some excellent commercial codes which offer 
all the functionality of NEC-2, but with proper graphical user tools and frequently 
greatly enhanced abilities; examples are FEKO (which will be used quite exten- 
sively in this book), SuperNEC, Ensemble, and IE3D. (Only SuperNEC is a direct 
descendant of NEC; the others are independent implementations.) There are also
some semi-commercial packages such as GEMACS which are limited to US De- 
partment of Defense contractors, and hence not generally available for commercial 
use world-wide. 

The strong points of the MoM (as usually applied) are the following. 


• Efficient treatment of perfectly or highly conducting surfaces. Only the surface is
meshed; no “air region” around the antenna need be meshed. For wire antennas, the
treatment is even more efficient, since only a one-dimensional discretization of the wire
is undertaken.

• The MoM automatically incorporates the “radiation condition”, i.e. the correct behavior
of the field far from the source (proportional to 1/r in free space). This is very important
when dealing with radiation or scattering problems.

• The working variable is the current density, from which many important antenna parameters
(impedance, gain, radiation patterns etc.) may be derived, some directly and some
via straightforward numerical integration.
• Via the Sommerfeld potentials, efficient formulations may be derived for stratified (layered)
media. Important examples are printed antennas, components and feed networks
(e.g. microstrip technology) and antenna-above-real-earth calculations.

• The availability of NEC-2 in the public domain: this powerful code has served as the
basis for much MoM based antenna design, and due to the open source nature, has lent
itself to all manner of numerical experimentation and improvement.


The weak points of the MoM may be summarized as follows. 


• The MoM does not handle electromagnetically penetrable materials as well as differential
equation formulations. (If the materials are homogeneous, a fictitious equivalent
surface current formulation may be used, but inhomogeneous materials require fictitious
equivalent volumetric currents, and become very expensive computationally.)

• The MoM does not scale gracefully with frequency: for typical applications requiring a
surface mesh, the scaling is $O((kd)^6)$, where $kd$ is the electromagnetic size of the structure.⁷
(This assumes a cubic structure, for simplicity.) Note that this implies an $O(f^6)$
scaling: doubling the frequency can result in a run-time 64 times as long! We will see
that this is a major problem with all the computational methods, although the details
do vary slightly from method to method. For an MoM volumetric mesh, required by an
inhomogeneous structure, the scaling is $O((kd)^9)$; this is so large that such methods are
usually very limited in application.

• Some MoM formulations, in particular those based on the magnetic field integral equation
(MFIE), require the surface to be closed. This is frequently impractical.
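
The frequency scaling quoted in the list above can be reproduced with a short calculation. The sketch assumes that, at a fixed mesh density per wavelength, the number of unknowns grows as the electrical size raised to the mesh dimension (2 for a surface, 3 for a volume), and that the dense matrix is solved by direct factorization costing the cube of the number of unknowns. Both assumptions, and the function names, are illustrative; an iterative solver would change the exponent.

```python
def cost_exponent(mesh_dim, solve_power=3):
    """Exponent p in O((kd)^p): unknowns grow as (kd)^mesh_dim, solve cost as N^solve_power."""
    return mesh_dim * solve_power


def runtime_ratio(freq_ratio, mesh_dim, solve_power=3):
    """Factor by which run-time grows when the frequency is scaled by freq_ratio."""
    return freq_ratio ** cost_exponent(mesh_dim, solve_power)


print("surface mesh:", cost_exponent(2), runtime_ratio(2, 2))
print("volume mesh: ", cost_exponent(3), runtime_ratio(2, 3))
```

Doubling the frequency multiplies the run-time by 64 for a surface mesh and by 512 for a volumetric mesh under these assumptions.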


In conclusion, the MoM is the preferred method for frequency domain radiation 
and scattering problems involving perfectly or highly conducting wires and/or sur- 
faces. If the problem involves inhomogeneous dielectric materials, it is unlikely to 
be the best formulation, but if hybridized with the FEM a very efficient formulation 
can result. 


1.4 The finite difference time domain (FDTD) method 


The finite difference time domain (FDTD) method is of a similar vintage to the 
MoM and FEM in electromagnetics, dating back to the 1960s. Like the FEM, it is 
partial differential equation based, and one does not need a Green function. Un- 
like the FEM, the FDTD method does not use variational functionals or weighted 
residuals — it directly approximates the differential operators in the Maxwell curl 
equations, on a grid staggered in time and space. E and H fields are computed 
on a regular grid, with a marching-on-in-time discretization of time, with field 


7 The notation $O(x^p)$ means “of the (asymptotic) order of” and indicates the highest power ($p$) present in the
variable ($x$); note that it says nothing of the constants. This can be important, since CEM analysis is quite often
undertaken in the “pre-asymptotic” region, where lower powers in $x$ may dominate, especially in the run-time.




components being offset by $\Delta s/2$ relative to each other and the E and H fields
evaluated $\Delta t/2$ apart in time, where $\Delta s$ and $\Delta t$ are the spatial and temporal
discretizations respectively. This permits a scheme which uses first-order numerical
differentiation to provide second-order accuracy. It is also the only widely used 
CEM scheme to operate in the time domain. (Time domain MoM and FEM formu- 
lations have been used, but usually for a rather specialized application. Frequency 
domain finite difference formulations are also available, but again have never be- 
come very popular for general problems.) 
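
The staggered, marching-on-in-time scheme described above can be sketched in one dimension in a few lines. This is a minimal illustration under stated assumptions, not the formulation of any particular code: the fields are normalized (H is scaled by the free-space impedance) so that both update coefficients equal the Courant number $c\Delta t/\Delta s$, the two end points are left at zero to act as perfectly conducting walls, and the Gaussian source, grid size and function name are all hypothetical.

```python
import math


def fdtd_1d(n_cells=200, n_steps=150, courant=0.5, src=50):
    """Leapfrog E/H updates on a staggered 1D grid (normalized units)."""
    ez = [0.0] * (n_cells + 1)  # E sampled at integer grid points
    hy = [0.0] * n_cells        # H sampled half a cell and half a time-step later
    for n in range(n_steps):
        # Advance H using the spatial difference of the current E.
        for k in range(n_cells):
            hy[k] += courant * (ez[k + 1] - ez[k])
        # Advance E using the spatial difference of the new H.
        # The end points are never updated, so they stay at zero (PEC walls).
        for k in range(1, n_cells):
            ez[k] += courant * (hy[k] - hy[k - 1])
        # Soft source: add a Gaussian pulse in time at one grid point.
        ez[src] += math.exp(-((n - 40) / 12.0) ** 2)
    return ez, hy


ez, hy = fdtd_1d()
peak = max(range(len(ez)), key=lambda k: ez[k])
print("right-going pulse peak near cell", peak)
```

Each update touches only nearest neighbours in space and the previous half-step in time, which is why no matrix needs to be stored or solved.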

Some history of the FDTD method may be found in Chapter 2. For various rea- 
sons, the method languished in relative obscurity throughout most of the 1960s 
and 1970s, but sprang to prominence in the 1980s. There were both technological 
driving factors behind this — on the one hand, increasing interest in the modelling 
of inhomogeneous materials, in particular for the assessment of human exposure to 
RF fields, and on the other, the development of low-observable “stealth” technol- 
ogy — and enabling technology in the shape of the enormous growth in computer 
power — in particular, memory, for which the FDTD method has a voracious ap- 
petite in three dimensions. The development by Berenger of the perfectly matched 
layer in 1994 solved the previously problematic issue of mesh termination, and 
removed the last hurdle to the widespread adoption of the method. In the new mil- 
lennium, with desktop PCs with hundreds of megabytes available at relatively low 
cost, the FDTD method has firmly established itself as one of the most popular 
methods in CEM, both in industry and academia. The apparent simplicity of the 
basic implementation also means that it is very popular with graduate students in 
the university research community, where “do-it-yourself” FDTD codes are com- 
monly encountered. 

Critics have dismissed the method as a “brute force” technique and, certainly, 
compared to the mathematical elegance and subtleties of a Sommerfeld inte- 
gral formulation, the basic method appears to make limited demands on higher 
mathematics. Most engineers trying to solve tough problems are of course more 
impressed by how well a code works, rather than by how elegant the formulation 
is, and the FDTD method has been enormously successful in many diverse applica- 
tions. Nonetheless, extensions of the FDTD method have required subtle thinking, 
as have stability proofs, so the “brute force” epithet is undeserved. 

The FDTD method is an “explicit” finite difference approach, i.e. no matrix 
equation is set up and solved.⁸ The term comes from the update equations, where
the field values at the next time-step are given entirely in terms of the field at this 
and the previous time-steps. (Implicit finite difference approaches, where the field 


8 In Chapter 10, it is shown that one can alternatively view the FDTD as derived from a diagonalized matrix 
equation. 
values at the next time-step at a point in space also involve the values at adjacent
points at the next time-step, generate a sparse matrix, which must be solved at 
each time-step.) This has the great advantage of keeping the required operations 
very simple — essentially just a stencil involving differencing neighbors in time 
and space — but does mean that the method is not unconditionally stable. This 
means that there is an upper limit on the time-step, and it turns out to be rather less 
than the Nyquist sampling criterion would imply, which is the price one pays for
the simplicity of the explicit approach. In three dimensions, the stability criterion
(widely known as the Courant limit) is $\Delta t < \Delta s/(\sqrt{3}\,c)$, where $\Delta s$ is the smallest
grid dimension and $c$ is the speed of light in the mesh.
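
As a small worked example of the three-dimensional Courant limit, the sketch below evaluates the largest permitted time-step for a given cell size, and then the number of time-steps per period that results when the grid has a given number of cells per wavelength. The free-space value of $c$, the 1 mm cell size and the function names are assumptions made for illustration.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def courant_limit(ds, c=C0):
    """Upper bound on the time-step for a 3D grid with smallest cell size ds (metres)."""
    return ds / (math.sqrt(3.0) * c)


def steps_per_period(cells_per_wavelength):
    """Time-steps per period of a sinusoid when running at the 3D Courant limit."""
    return cells_per_wavelength * math.sqrt(3.0)


print(courant_limit(1e-3))   # time-step bound for 1 mm cells in free space
print(steps_per_period(10))  # compare with the 2 samples per period of Nyquist
```

With ten cells per wavelength the scheme takes roughly seventeen time-steps per period, far more than the two samples per period that the Nyquist criterion alone would require.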

There are three very good texts on the FDTD method; Kunz and Luebbers’
was the first [6], appearing in 1993, but Taflove’s 1995 volume [7] (recently re- 
vised [8]) and the 1998 companion [9] are currently the state of the art. (Kunz 
and Luebbers were unfortunate to publish their book just before the revolutionary 
perfectly matched layer (PML) was invented by Berenger in 1994, although the 
book still contains useful material, not least a working FDTD code. This code has 
served as the basis for a number of academic codes.) 

The time domain formulation of the FDTD method is both an attractive fea- 
ture and a drawback. Using wideband sources, the FDTD method can compute a 
wideband response in one run, whereas frequency domain methods must obviously 
recompute the system response for each frequency point.⁹ However, the majority
of RF devices operate over quite a narrow frequency band, and this may be less of
an advantage than one might expect. In particular, for high-Q devices, a very large
number of time-steps may be required to obtain sufficient frequency resolution.¹⁰
Many systems exhibit dispersive properties; examples are waveguides and most 
real dielectric and magnetic materials. In the frequency domain, this is simple to 
handle by the obvious expedient of simply changing the material/device properties 
with frequency, but in the time domain this is more challenging since a convolution 
is implied. Many techniques have been proposed and implemented to address this 
issue, but they do complicate the method somewhat [8, 9].
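
The remark about high-Q devices can be made concrete with the usual Fourier relation between record length and frequency resolution. The sketch assumes that the spectrum is obtained by a plain discrete Fourier transform of the time record, so that the resolution is the reciprocal of the total simulated time; extrapolation or model-based techniques would relax this. The function name and example values are hypothetical.

```python
import math


def time_steps_for_resolution(dt, df):
    """Number of time-steps whose total duration is at least 1/df seconds."""
    return math.ceil(1.0 / (df * dt))


# Hypothetical example: 1.9 ps time-step, 1 MHz resolution wanted.
print(time_steps_for_resolution(1.9e-12, 1e6))
```

With a picosecond-scale time-step, resolving features a megahertz apart takes on the order of half a million time-steps, which illustrates why narrowband, high-Q structures can be expensive in the time domain.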

Although there are some FDTD codes available on the internet, they are really 
“toy” codes by comparison with NEC-2, for instance. Commercial versions are 
available, including CST MICROWAVE STUDIO® and Remcom’s XFDTD; the
former actually uses the finite integration technique, but this is very closely related 
to the FDTD. It is perhaps surprising that more contenders have not emerged, but 
this is in no small part because a useful commercial code has to incorporate not 


9 Continuing research for frequency domain codes on model based parameter estimation (MBPE) aims to reduce 
dramatically the number of frequency points required, and good results have been obtained; some commercial 
frequency domain codes already incorporate this. 

10 Again, work similar to the MBPE, using system approximation techniques, can assist here.




only a decent user interface, but also a number of extensions to the standard FDTD 
to make it generally useful. 


The strong points of the FDTD method are the following. 


• Exceptionally simple implementation for a full-wave solver: at least an order of magnitude
less work than either an MoM or FEM implementation for a basic FDTD implementation.
(One should be warned however that there are a number of subtleties which
can take a while to appreciate, even with an apparently simple problem. Also, many
practical problems require more than just a basic implementation, and the simplicity of
the method is often compromised by these extra factors.) It is the only method which
one can realistically implement oneself in a reasonable timeframe, although even then
only for quite specific problems.

• Very straightforward treatment of material inhomogeneities (as for the FEM).¹¹

• Fairly accurate geometrical modelling ability (but not as versatile as the FEM in this
regard, due to the “stair-stepping” effect of the regular mesh; see comments below on
non-orthogonal grids). (Commercial codes frequently include extensions to the method
to improve this, so it is not necessarily a problem.)

• Since the method is a time domain one, wideband data are potentially available from one
run.

• Reasonable scaling behavior, $O((kd)^{5.5})$, with the same $N \propto (kd)^{4.5}$ assumption to control
dispersion as for the FEM, which we will discuss shortly. Note that as for the FEM,
this is not affected by the material composition of the structure. For wideband systems,
this is very attractive, since the other methods have an implied $f_n$ multiplicative term (not
shown in the preceding sections), where $f_n$ is the number of frequency points required.

• The PML has made implementing very good absorbing boundary conditions as mesh
termination relatively straightforward.


The main drawbacks are the following. 


• Inflexible meshing: much work has been done on non-orthogonal FDTD grids, but the
method then loses much of its appealing simplicity.

• Some uncertainty about the precise position of boundaries, usually an uncertainty of
about $\Delta s/2$. This is due to the offset nature of the E and H field grids.¹²

• Dispersive materials require considerable effort to implement correctly, but it is possible
and good results have been obtained.

• As with the FEM, the FDTD method is not as efficient as the MoM when modelling
structures consisting entirely of perfectly or highly conducting radiators/scatterers.


11 A point worth making here is that for typical RF applications, the dielectric properties of materials are usually
the most significant, and relative permittivities at RF and microwave frequencies rarely exceed single figures. 
For low-frequency magnetoquasistatic problems, magnetic properties are often the most significant, with rela- 
tive permeabilities which can be very large indeed. In this case, the matter is not quite as simple when accurate 
modelling is desired, and both the FDTD method and the FEM can exhibit problems. This, however, is not the 
focus of this book. 

12 Work has been done on improving this, typically using some averaging of properties in the FDTD cells on the
boundaries, but this can impact on the second-order accuracy of the method. 
• Although considerable theoretical work has been done on higher-order FDTD approaches,
none appears to have been successfully implemented in a general-purpose
FDTD code. The problem is intimately linked to that mentioned above regarding the
ambiguity of the boundary positions.


In conclusion, the FDTD method is the preferred method for wideband systems. 
Even in its standard Yee form, it is also a strong contender for any electromagnetic 
radiation or scattering problem for which quick answers are needed, great accuracy 
is not the primary concern, and quite large run-times and memory usage are acceptable.
Furthermore, by using a sufficiently fine mesh, and in particular using various
extensions to the standard FDTD method, very accurate results may be obtained; 
the potential accuracy of the FDTD method should not be underestimated. 


1.5 The finite element method (FEM) 


The finite element method (FEM) has been widely used in structural mechanics 
and thermodynamics; its first application in the modern form dates to the 1950s, 
although its mathematical roots are older, and the first application in electromag- 
netics was undertaken in the late 1960s. Chapter 9 gives some more historical 
background on the method. 

As with its main competitor, the FDTD method, the FEM handles inhomoge- 
neous materials and complex geometries with aplomb; these become problems in 
mesh generation rather than in electromagnetic theory. The FEM may be derived 
from two viewpoints: one uses variational analysis, the other weighted residuals. 
Both start with the partial differential equation (PDE) form of Maxwell’s equations.
The former finds a variational functional whose minimum¹³ corresponds
with the solution of the PDE, subject to certain boundary conditions. The latter
also starts with the PDE form of Maxwell’s equations, and then introduces
a “weighted” residual (error); using Green’s theorem, one of the differentials in
the PDE is “shifted” to the weighting functions.¹⁴ For most applications, these
procedures result in identical equations. In both cases, the unknown field is discretized
using a finite element mesh; typically, triangular elements are used for
surface meshes and tetrahedrons for volumetric meshes, although many other types
of elements are available. Triangles and tetrahedrons have certain attractive properties
best summarized as “simplicial”: these are the simplest geometrical forms
with which two-dimensional and three-dimensional regions respectively can be
meshed.
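
The “shifting” of a derivative onto the weighting function can be illustrated on a one-dimensional scalar model problem. The equation below (a scalar Helmholtz equation on the interval $a \le x \le b$, with wavenumber $k$ and weighting function $w$) is chosen here purely for illustration and is not the vector formulation used for three-dimensional problems.

```latex
\begin{align}
  % Model PDE (illustrative assumption)
  \frac{d^2 E}{dx^2} + k^2 E &= 0, \qquad a \le x \le b \\
  % Weighted residual: multiply by w(x) and integrate over the domain
  \int_a^b w \left( \frac{d^2 E}{dx^2} + k^2 E \right) dx &= 0 \\
  % Integrate the first term by parts (the 1D form of Green's theorem)
  -\int_a^b \frac{dw}{dx}\,\frac{dE}{dx}\,dx
  + k^2 \int_a^b w\,E\,dx
  + \left[ w\,\frac{dE}{dx} \right]_a^b &= 0
\end{align}
```

In the last line only first derivatives of $E$ and $w$ appear, so basis functions that are merely once differentiable (such as piecewise-linear ones) suffice; the bracketed term is where the boundary conditions enter.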


13 More precisely, extremal point, since it may also be a maximum or stationary point. 

14 From which comes the name “weak” formulation, sometimes encountered in the literature, since the finite 
element basis functions need only be once differentiable, whereas the wave equation has second-order deriva- 
tives. 
Finite element analysis (FEA) can handle two different types of problem, viz.
eigenanalysis (source-free) and deterministic (driven) FEA problems.¹⁵ Problems
without any internal (or external) field source fall into the category of eigenanal- 
ysis problems. A classic example is a cavity resonator. What emerges from the 
analysis is a set of eigenvalues and associated eigenmodes; these represent the 
resonant frequencies and associated field distribution within the cavity. For mi- 
crowave dielectric heating, this information can be used to design feed locations 
and optimize load positioning.¹⁶ It should be noted that eigenanalysis applications
are neither time nor frequency but rather eigenvalue domain solvers; using a sim- 
ple transformation, it is possible to include operating frequency in a waveguide 
simulation, to compute dispersion curves. 
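
For the cavity resonator example, the kind of output an eigenanalysis produces can be illustrated with the closed-form resonant frequencies of an ideal, air-filled rectangular cavity with perfectly conducting walls. This is the standard analytical result, not a finite element computation; it is the sort of reference against which a numerical eigen-solver could be checked. The cavity dimensions and function name are hypothetical.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def cavity_resonance(m, n, p, a, b, d):
    """Resonant frequency (Hz) of mode (m, n, p) in an a x b x d metre rectangular cavity."""
    return 0.5 * C0 * math.sqrt((m / a) ** 2 + (n / b) ** 2 + (p / d) ** 2)


# Modes with non-zero fields have at most one zero index.
triples = [(m, n, p) for m in range(3) for n in range(3) for p in range(3)
           if (m, n, p).count(0) <= 1]

# Hypothetical 10 cm cubic cavity: list the lowest few resonances in GHz.
modes = sorted((cavity_resonance(m, n, p, 0.1, 0.1, 0.1), (m, n, p)) for m, n, p in triples)
for freq, indices in modes[:5]:
    print(indices, round(freq / 1e9, 3))
```

Each index triple plays the role of an eigenmode and each frequency the role of the associated eigenvalue; in a cube several modes share the same frequency, which a numerical solver would report as repeated eigenvalues.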

Deterministic problems analyzed using FEA involve a source; the response of 
the structure to this excitation is then computed. This represents a very large class 
of electromagnetic engineering applications of the FEM, including antenna, radar 
cross-section, microwave circuit and periodic structure analyses. 

As with the FDTD method, the FEM does not include the radiation condition. 
For closed regions (e.g. waveguide devices or cavities) this is of no concern. How- 
ever, for open regions (e.g. radiation or scattering problems), this requires spe- 
cial treatment, and this must be incorporated using either an artificial absorbing 
region within the mesh (the numerical analogue of an anechoic chamber) or using a
hybridization with the MoM to terminate the mesh. 

Traditionally, the FEM has been formulated in the frequency domain, although 
time domain formulations have also been used for specialized applications. 

There are a number of excellent and up-to-date texts on the FEM, including 
those by Jin (revised in 2002) [10], Silvester and Ferrari [11] (the third edition 
appearing in 1996), Volakis et al. [12] and Peterson et al. [13] (both published in
1998, the last also including comprehensive coverage of the MoM). The collection 
of papers edited by Silvester and Pelosi [14] is also very useful, although quite 
a number of significant papers have appeared since its 1994 publication. Another 
useful source is the 1996 volume edited by Itoh et al. [15]. 

Several companies market commercial finite element products for radio- 
frequency electromagnetics. Ansoft’s HFSS package is widely regarded as the 
market leader; Ansys have a suitable product, and a fairly recent entry, FEMLAB, 
has also attracted users. 


15 The MoM and finite difference methods in general can also be used for eigenanalysis, but are not very commonly
encountered. Harrington’s original text on the MoM included a chapter on eigenvalue problems, but the
MoM has not been as widely used as the FEM for this class of problem. The FDTD method is by definition 
deterministic, and requires a source. 

16 In this real-world application, there is now a source and the problem is strictly speaking no longer an eigenanalysis
one, but the source location can be optimized by knowledge of the resonant field behavior within the
cavity, since these fields are what one is attempting to excite with the feed. 
The strong points of the FEM are the following.


• Very straightforward treatment of complex geometries and material inhomogeneities.

• Very simple handling of dispersive materials (i.e. materials with frequency-dependent
properties).

• Ability to handle eigenproblems as above.

• Potentially better frequency scaling than the MoM, although the requirement to mesh
a volume rather than a surface means that the number of unknowns in the problem is
usually much larger.¹⁷

• Straightforward extension to higher-order basis functions. The FEM lends itself to the
use of higher-order basis functions; although the book-keeping within an FEM code is a
little complicated by this, the theoretical extensions are now well understood. It is also
possible to use conformal elements to better approximate curved geometries.

• “Multi-physics” potential: this means the ability to couple EM solutions with, for instance,
mechanical or thermal solutions. Due no doubt to the widespread popularity
and maturity of the FEM in other fields of engineering, one is starting to see packages
which can compute such coupled solutions. It is probably only significant in high-power
applications, where thermal effects can be important, either desired, as in the
case of microwave dielectric heating, or undesired, such as with high-power transmitter
design.
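
The frequency scaling of differential equation based solvers depends on how the mesh must be refined to control dispersion. A brief sketch of the usual argument follows, under the assumptions of a second-order accurate scheme, a structure of linear size $d$, cell size $h$, wavelength $\lambda$ and a fixed tolerable accumulated phase error $\epsilon$; the symbols $h$ and $\epsilon$ are introduced here for illustration only.

```latex
\begin{align}
  % Phase error per wavelength is second order in h; it accumulates over d/\lambda wavelengths
  \epsilon &\propto \frac{d}{\lambda} \left( \frac{h}{\lambda} \right)^{2} \\
  % Holding \epsilon fixed as the structure grows
  \frac{h}{\lambda} &\propto \left( \frac{d}{\lambda} \right)^{-1/2} \propto (kd)^{-1/2} \\
  % Cells along one dimension
  \frac{d}{h} = \frac{d}{\lambda} \cdot \frac{\lambda}{h} &\propto (kd)^{3/2} \\
  % Unknowns in a three-dimensional mesh
  N \propto \left( \frac{d}{h} \right)^{3} &\propto (kd)^{4.5}
\end{align}
```

Without the dispersion constraint the count would grow only as $(kd)^3$; the extra factor of $(kd)^{1.5}$ is the cost of keeping the accumulated phase error constant as the problem becomes electromagnetically larger.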


The weak points of the FEM include the following. 


• Inefficient treatment of highly conducting radiators when compared to the MoM (due to
the requirement to have some mesh between the radiator and the absorber).¹⁸

• The FEM meshes can become very complex for large three-dimensional structures;
indeed, some workers have reported mesh generation times starting to exceed solution
time.

• The FEM is rather more complex to implement than the FDTD method. This impacts in
particular in terms of the suitability of the FEM for parallel computing. It also implies
that “home-built” FEM codes are quite rare compared to such FDTD codes.

• Efficient preconditioned iterative solvers are required when higher-order elements
are used; so important is this in commercial applications that these are usually
treated as proprietary information, making “do-it-yourself” implementation even more
challenging.


17 The exact scaling behavior depends on how efficiently the sparsity of the finite element matrix can be exploited,
and the sparsity pattern is problem dependent. The lowest bound on this is $O(N)$, $N$ being the
number of degrees of freedom (unknowns). For a scheme with second-order accuracy, $N \propto (kd)^{4.5}$, where
the exponent indicates that as the problem size grows, so the mesh must become proportionally finer to
control mesh dispersion. This effect was often overlooked in earlier analyses of differential equation based
solvers. With these assumptions, the lowest bound on the FEM is $O((kd)^{4.5})$; it must be emphasized that
this is a lowest bound and assumes that matrix sparsity is essentially fixed, which is not so in reality. A more
realistic estimate is probably $O((kd)^5)$ to $O((kd)^6)$.

18 The FEM/MoM hybrids overcome this problem.
In conclusion, the FEM is the preferred method for microwave device simulation
and eigenproblem analysis. Using FEM/MoM hybrids, scattering problems
involving electromagnetically penetrable media and specialized antenna problems 
can be accurately and efficiently solved. 


1.6 Other methods 


The MoM, FEM and FDTD method are the most popular methods in current use. 
There are a number of other methods which will be encountered in the literature, 
and some commercial codes are based on these methods. Here we will briefly 
outline some of them. 


1.6.1 Transmission line matrix (TLM) method 


The TLM method is conceptually very similar to the FDTD method. Instead of 
directly discretizing the Maxwell equations, an equivalent array of short transmis- 
sion lines is used. The method is appealing to engineers with a strong circuit but 
weak field background, but for most CEM practitioners the circuit approximation 
of the field equations seems rather circuitous. It should be commented that the 
circuit approach can be more direct than the FDTD approach when one is deal- 
ing with high-frequency circuits, which is a major reason for continuing work on 
the TLM. The method has a dedicated following in some circles, and at least one 
commercial code, Micro-Stripes, is available. 


1.6.2 The method of lines (MoL) 


The MoL is a specialized method, primarily for waveguiding structures. It uses
a semi-analytical solution along a number of lines (in its two-dimensional form) 
and is especially memory efficient. It is also very accurate. Because it requires the 
extraction of eigenvalues, it can be computationally expensive. Most MoL applica- 
tions can be done as well with an FEM formulation. A commercial implementation 
does not appear to be available at present. 


1.6.3 The generalized multipole technique (GMT) 


The GMT uses multipoles as the basis functions; these are special function so- 
lutions of the Maxwell equations. It is not especially similar to any of the meth- 
ods that we have discussed thus far. It does require some intelligent user input in 
terms of placing the multipoles. Good results have been obtained for a variety of 
problems; a good reference is Hafner’s book [16], which is also useful for placing
the other methods in perspective. (The book will, however, appear somewhat
idiosyncratic to CEM novices.)


1.7 The CEM modelling process 


Before we now proceed to study these methods in more detail in subsequent chap- 
ters, it is useful to comment on the modelling process in general. Some astonishing 
claims have been made about the predictive power of Maxwell’s equations [17, 
p. 4]: 


Most physicists believe that if you lock a graduate student in a room and have him perform 
an electromagnetic calculation correctly, and if you perform an experiment that does not 
agree with the graduate student’s calculation, then you better check your experiment. 


Whilst at its heart this observation is true, in that we believe that for non-quantum
interactions, Maxwell’s equations provide a complete description of electromagnetic
phenomena,¹⁹ for many aspects of CEM modelling, one needs to be extremely
cautious of such sentiments. The modelling process is about the art of
acceptable approximation, and this path is strewn with pitfalls.

Firstly, we are replacing the real world with a mathematical model, or put dif- 
ferently, replacing a real field problem with an approximate one. Here are some 
examples of possible problems. 


Limitations of the mathematical model Mathematical models of electromagnetic 
devices usually have some underlying assumptions. An example is the infinitely 
large planar ground assumed in a Sommerfeld formulation. Most integrated anten- 
nas radiate primarily on broadside, so the finite ground of a real antenna is usually 
not a problem. However, endfire integrated antennas (a Yagi, for instance, photo- 
etched on a printed circuit board) radiate most strongly along the ground interface, 
and the main beam on endfire apparently disappears when a Sommerfeld code is 
used. 


19 By replacing the field vectors with operators, Maxwell’s equations become quantum theoretically correct.


Tolerances Any engineered system has some measure of tolerances. Some are really
of little concern; others impact directly on device performance. An example
of the former is surface roughness of average dimensions far less than a wavelength;
this usually has little impact on the operation of the system. Examples of
the latter are uncertainties in dielectric constant or overall device dimensions. For
antennas relying on standing-wave operation (most wire antennas, microstrip
patch structures etc.) these translate directly into variations in resonant frequency.
(Since such devices are usually quite narrowband, this can be highly problematic.)
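The effect of a material tolerance on resonant frequency can be estimated with a few lines of code. The sketch below is not taken from the text: it assumes the idealized half-wavelength resonance $f = c/(2L\sqrt{\varepsilon_r})$ with fringing fields ignored, and the resonant length, nominal permittivity and 5% tolerance are hypothetical values chosen only for illustration.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def half_wave_resonance(length_m, eps_r):
    """Idealized resonant frequency (Hz) of a half-wavelength resonator.

    Assumption: f = c / (2 L sqrt(eps_r)); fringing fields are ignored.
    """
    return C0 / (2.0 * length_m * math.sqrt(eps_r))


# Hypothetical nominal design values, for illustration only.
length_nominal = 0.030  # resonant length, 30 mm
eps_nominal = 4.4       # nominal relative permittivity
eps_tolerance = 0.05    # assumed +/-5 % spread in permittivity

f_nominal = half_wave_resonance(length_nominal, eps_nominal)
f_low = half_wave_resonance(length_nominal, eps_nominal * (1 + eps_tolerance))
f_high = half_wave_resonance(length_nominal, eps_nominal * (1 - eps_tolerance))

# Fractional shift in resonant frequency relative to the nominal design.
shift_low = (f_low - f_nominal) / f_nominal
shift_high = (f_high - f_nominal) / f_nominal

print(f"nominal resonance: {f_nominal / 1e9:.3f} GHz")
print(f"shift for +5% permittivity: {100 * shift_low:.2f} %")
print(f"shift for -5% permittivity: {100 * shift_high:.2f} %")
```

Under this simple model the fractional frequency shift is roughly half the fractional change in permittivity, with opposite sign. For a narrowband device, a shift of this size may be a significant fraction of the usable bandwidth.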


Manufacturing deviations The device which is being simulated may differ subtly 
from what was designed and analyzed, in a more fundamental way than simply 
due to device tolerances. An example of a frequency selective surface with such a 
problem will be discussed in the next section in detail; there, an air inclusion was 
de-tuning the device, and measurements and computations stubbornly refused to 
agree until the problem was identified. 


Simplifications in the formulation Currents flowing on thin wires are usually ap- 
proximated by current filaments (in other words, wire thickness is ignored). This 
can cause problems when the wire thickness no longer justifies this assump- 
tion. 


Once we have an acceptable approximate field problem, it will then be solved using 
an approximate numerical solution. Once again, there are many pitfalls in this next 
step. 


Finite discretization This is usually the biggest single limitation on the accuracy of
numerical techniques in electromagnetics. There are typically two different types
of error which accompany this: one is interpolation error, which is the (in)ability
of the basis functions to model the field locally; the other is mesh dispersion error
(also sometimes called pollution error), which is the cumulative error through the
mesh.²⁰ Both can usually be controlled by refining the mesh. Unfortunately, the
computational cost of especially three-dimensional modelling is such that this is
not always practical.
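The cumulative nature of mesh dispersion error can be illustrated with a simple model problem. The sketch below is an illustration under stated assumptions rather than a method from the text: it takes the one-dimensional Helmholtz equation $u'' + k^2 u = 0$, discretized with the standard three-point central difference on a uniform grid of spacing $h$, for which a discrete plane wave satisfies $\cos(\tilde{k}h) = 1 - (kh)^2/2$. The function names are hypothetical.

```python
import math


def numerical_wavenumber_ratio(points_per_wavelength):
    """Ratio of numerical to exact wavenumber for the three-point
    central difference discretization of u'' + k^2 u = 0 in 1D."""
    kh = 2.0 * math.pi / points_per_wavelength
    # Discrete dispersion relation: cos(k_num * h) = 1 - (k h)^2 / 2.
    k_num_h = math.acos(1.0 - 0.5 * kh * kh)
    return k_num_h / kh


def phase_error_degrees(points_per_wavelength, distance_in_wavelengths):
    """Accumulated phase error after propagating the given distance."""
    ratio = numerical_wavenumber_ratio(points_per_wavelength)
    return abs(ratio - 1.0) * 360.0 * distance_in_wavelengths


for n in (10, 20, 40):
    row = ", ".join(
        f"{phase_error_degrees(n, d):8.2f} deg over {d:3d} wavelengths"
        for d in (1, 10, 100)
    )
    print(f"{n:2d} points per wavelength: {row}")
```

In this model the phase error grows linearly with propagation distance and falls roughly as the square of the mesh density, so holding the total error fixed on an electrically larger problem requires a finer mesh per wavelength, not merely more cells.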


Finite problem space Neither the FDTD method nor the FEM incorporates the 
radiation condition, and the mesh needs to be truncated at some point when a 
radiation or scattering problem is undertaken. Absorbing boundary conditions are 
widely used for this. After creating an (in)adequately refined mesh, this is probably 
the second largest source of error in FDTD and FEM computations; in reality, the 
problems are interwoven, since a poor mesh termination scheme requires a larger 
solution region, which in turn makes it difficult to ensure that the mesh is fine 
enough. 


20 The MoM does not suffer from mesh dispersion, only the differential equation based FDTD method and
FEM do.


Numerical approximations FEM and MoM codes in particular require numerical
integration. This is usually done using quadrature or cubature (multi-dimensional 
numerical integration), and if not carefully done, may easily result in poor perfor- 
mance, in particular for the MoM which involves the integration of nearly singular 
or singular fields. 
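The difficulty with nearly singular integrands can be demonstrated on a simple test integral. The sketch below is an illustration only, not an actual MoM kernel: it applies Gauss-Legendre quadrature (via NumPy's `leggauss`) to $\int_{-1}^{1} dx/\sqrt{x^2 + a^2} = 2\sinh^{-1}(1/a)$, which becomes nearly singular at $x = 0$ as the assumed parameter $a$ shrinks. Even orders are used so that no node falls exactly on the peak.

```python
import math

import numpy as np


def gauss_legendre(integrand, order):
    """Integrate over [-1, 1] with an `order`-point Gauss-Legendre rule."""
    nodes, weights = np.polynomial.legendre.leggauss(order)
    return float(np.sum(weights * integrand(nodes)))


def relative_error(a, order):
    """Relative quadrature error for the test kernel 1/sqrt(x^2 + a^2)."""
    exact = 2.0 * math.asinh(1.0 / a)
    approx = gauss_legendre(lambda x: 1.0 / np.sqrt(x * x + a * a), order)
    return abs(approx - exact) / exact


# Smaller a means a sharper peak at x = 0, i.e. a more nearly singular kernel.
for a in (1.0, 0.1, 0.01):
    errors = ", ".join(f"{relative_error(a, n):.1e}" for n in (4, 16, 64))
    print(f"a = {a:5.2f}: relative error for 4, 16, 64 points = {errors}")
```

A low-order rule that is perfectly adequate for the smooth case fails badly once the kernel is sharply peaked, which is why production codes typically resort to special treatment (for example singularity subtraction or adaptive rules) for such terms.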


Finite machine precision The infinite set of real (or complex) numbers is
represented on a computer in a very finite fashion. Typically, in single precision,
4 bytes (i.e. 32 bits) are used to represent a real number; this gives some 7 significant
digits of accuracy. (Double precision uses 8 bytes, and approximately doubles
this.) For problems which are ill conditioned (that is, the solution depends rather
drastically on small changes in input data) this may not be adequate. This is usually
a less serious problem than the others.
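The storage size and precision of floating-point numbers, and their interaction with ill conditioning, can be inspected directly. The following sketch uses NumPy's `finfo` to report the properties of IEEE single and double precision, and then solves a small, deliberately ill-conditioned 2 x 2 system in both precisions. The matrix and the value of `delta` are hypothetical choices made only for illustration; the exact solution is (1, 1).

```python
import numpy as np

# Report storage size and precision of the two common floating-point types.
for dtype in (np.float32, np.float64):
    info = np.finfo(dtype)
    print(f"{dtype.__name__}: {info.bits} bits, machine epsilon {info.eps:.2e}, "
          f"about {info.precision} reliable decimal digits")

# A nearly singular system: the two rows differ only by delta.
delta = 1e-4
A64 = np.array([[1.0, 1.0], [1.0, 1.0 + delta]])
b64 = np.array([2.0, 2.0 + delta])  # chosen so that the solution is (1, 1)

x64 = np.linalg.solve(A64, b64)
# Rounding the same data to single precision before solving.
x32 = np.linalg.solve(A64.astype(np.float32), b64.astype(np.float32))

err64 = float(np.max(np.abs(x64 - 1.0)))
err32 = float(np.max(np.abs(x32.astype(np.float64) - 1.0)))
print(f"condition number: {np.linalg.cond(A64):.1e}")
print(f"double precision error: {err64:.1e}")
print(f"single precision error: {err32:.1e}")
```

The loss of accuracy in single precision is far larger than machine epsilon alone would suggest, because rounding errors in the data are amplified by roughly the condition number of the matrix.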


1.8 Verification and validation 


The discussion of the previous section leads directly to the issues of the verification 
and validation of code. One might define the former as ensuring that the code has 
correctly implemented the formulation, and the latter as checking that the formu- 
lation as implemented in the code produces results agreeing with reality. However, 
for users of codes in practice, the processes are integrated, especially since users 
cannot change commercial codes. Throughout this book, the necessity of validat- 
ing and verifying code will be continually emphasized, but it is such an important 
topic in CEM that it deserves this section on its own. 
There are several methods currently in widespread use. 


Comparison with analytically computed solutions This was the classical approach 
taken by most of the early researchers in the field. Typically, radiation or scattering 
solutions involving canonical shapes (usually cylinders in two dimensions, and 
spheres in three dimensions) were used to compare results with those of the code. 
The problem with this is that it is a necessary, but not sufficient, condition for a 
code to be working correctly. 


Comparison with approximate solutions Quite often, approximate solutions of 
electromagnetic problems are known from simplified models, which have usu- 
ally been experimentally tested, or may even represent experimental data. Many 
antennas are a good example of this, with parameters such as gain often a design 
parameter. Comparison of computed results with these provides some reassurance 
that the code is in the correct “ballpark,” although of course this is not a rigorous 
process.


Comparison with measurements In a sense, this is the most satisfying and convincing
method. However, it is strewn with difficulties. Unlike CEM, where the
basic tools have dropped enormously in price, radio-frequency measurement de- 
vices have remained expensive, and accurate measurements of radiation or scatter- 
ing require an anechoic chamber. Making reliable and repeatable measurements is 
also both a science and an art, and usually requires considerable experience. 


Comparison with other CEM codes This is a relatively recent innovation, promp- 
ted both by the availability of powerful general-purpose codes and the difficulty 
of obtaining reliable measured data. Once again, caution is required. This is one 
place where the difference between validation and verification can be significant: 
to give an example, validating a thin-wire code by comparison to another thin- 
wire code will not detect a fundamental problem with the thin-wire assumption. 
In general, this is most convincingly done by comparing results computed using 
codes implementing different formulations. 


We will see examples of the use of all these techniques throughout this book. 


1.8.1 An example: a frequency selective surface 


The process of validating computations can sometimes lead to enhanced under- 
standing of the device under test. An example is the following, originally pre- 
sented in [18, 19] for a device called a frequency selective surface (FSS). There 
are various applications thereof: in this case, a bandpass radome was required. The 
structure consists of a slot cut in a metallic sheet, which transmits an incident wave 
when the frequency is such that the slot is resonant. 

When an FSS is fabricated, a dielectric support is generally required, lowering 
the resonant frequency and complicating the analysis. An FDTD code, originally 
developed by the author, was used to simulate the dielectric support. However, 
initial results yielded a consistent offset in center frequency between measured and 
computed data, which was sufficiently large to be problematic. The usual FDTD 
checks — refining the mesh, and moving the absorbing boundary condition further 
away — did not solve the problem. 

Eventually, the problem was traced to a very subtle manufacturing problem. 
When manufacturing a dielectric-supported FSS structure, the finite thickness of 
the metal screen can be surprisingly significant, whether a sandwiched or single- 
sided support is used; this results effectively in a slot. Although the slot is small in 
cross-section, the material filling it plays a significant role in the electromagnetic 
behavior of the device. An example of the slot forming the FSS element in a finite 
thickness conductor is illustrated in Fig. 1.1. This is a cross-section of a circular 
ring FSS element, with diameter 5.9055 mm, slot width 0.537 mm, and element 
spacing 10.738 mm.

Figure 1.1 A cross-section of a slot forming the FSS in a conductor of finite thickness,
w, sandwiched between two dielectrics. (Reproduced by permission of IEE from [18].)

Figure 1.2 Predicted transmission coefficient of an O-ring FSS with one side perspex only.
Legend: solid line (o), infinitely thin metal sheet, single-cell perspex in the slot; solid line
(x), infinitely thin metal sheet, single-cell air in the slot; dashed line (+), actual 0.26845
mm thick metal sheet, air in the slot. (Reproduced by permission of IEE from [18].)


The FDTD code can accurately predict the effect of different dielectrics, provided
the significance of this effect is realized and correctly modelled. Figure 1.2
shows this effect clearly; the resonant frequency is off by around 13% for the
FDTD model which (incorrectly) assigned perspex to the slot. Two of the models
in Fig. 1.2 used an infinitely thin metal sheet in the FDTD model; in one case
perspex of a single FDTD cell thickness was used to model the cavity formed by
the slot; in the other case air was assigned to the slot cavity, although the depth of
the actual slot was not entirely correctly simulated. The final model used the correct
actual metal thickness and the slot was air filled, also to the correct thickness.
The difference in predicted resonance frequency is significant.

Figure 1.3 Transmission coefficient of PVC sandwiched FSS with petroleum jelly filling
in the slot. The solid line is measured data, the broken line is the FDTD simulation.
(Reproduced by permission of SAIEE from [19].)

Figure 1.3 shows measured and predicted results at normal incidence for a horizontal
tri-slot sandwiched between PVC ($\varepsilon_r = 2.86$), with petroleum jelly (Vaseline,
$\varepsilon_r = 2.16$) used to fill the slot which has been carefully modelled with the
FDTD mesh. (This particular tri-slot had arm length 3.732 mm, arm width 1.0 mm,
and inter-element spacing 12.5 mm.) The results demonstrate the accuracy achievable
with careful modelling.

This is an example of a discrepancy between the real and approximate field 
problems, due in this case to a manufacturing problem. It is especially useful in 
that it led to improved understanding of the design, and a revised manufactur- 
ing process. CEM tools (the FDTD in this case) allowed very quick experimen- 
tation to establish that the air inclusion (in this case) was the problem; laboratory 
experiments with various prototypes would have been very tedious and time con- 
suming indeed. 


1.9 Extending the limits of full-wave CEM methods 


It should be clear from the preceding sections that no one CEM method should lay
claim to being able to address all problems with optimum efficiency. Both the FEM
and FDTD method are theoretically capable of addressing any arbitrary problem,
but both can be unnecessarily or prohibitively expensive computationally for problems
more suited to the MoM, and in practice of course no full-wave method
works at asymptotically high frequencies.

The computational cost of the major methods has been briefly reviewed in this
chapter. Although the individual methods vary significantly, it is clear that in the
case of all the full-wave solvers, as the frequency increases, so the mesh must
become finer (and thus the number of unknowns becomes larger). We have seen
that although the methods have different scaling properties, all of them scale badly
with frequency. Even the most attractive scales at around the fifth to sixth power of
frequency.²¹ Much ingenuity has been devoted to developing new or modified full-wave
methods with better scaling behavior (the present class of “fast methods”
being an excellent case in point) and to exploiting high-performance computers,
in particular parallel computing.
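To make the consequence of fifth- to sixth-power scaling concrete, the following minimal sketch computes how much the cost grows when the frequency is raised by a given factor. It assumes a pure power law with no constant factors, which is a simplification; the function name is hypothetical.

```python
def cost_growth(frequency_ratio, exponent):
    """Factor by which cost grows under an assumed pure power law,
    cost proportional to frequency ** exponent."""
    return frequency_ratio ** exponent


for exponent in (5, 6):
    for frequency_ratio in (2, 10):
        growth = cost_growth(frequency_ratio, exponent)
        print(f"frequency x{frequency_ratio:2d}, exponent {exponent}: "
              f"cost x{growth:,.0f}")
```

Under these assumptions, merely doubling the frequency multiplies the cost by a factor of 32 to 64, so a simulation that takes an hour would take more than a day.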

“Fast” methods, including the fast multipole method and the adaptive integral 
method, aim to reduce the asymptotic cost of the methods. Put very simply, the 
methods replace the traditional direct matrix solution algorithms with iterative 
solvers, and use methods to approximate the interaction between parts of the mesh 
which are separated by some reasonable distance (usually at least a few wavelengths).
The matrix-vector product, which lies at the heart of iterative solvers,
is implemented using a fast technique similar to the FFT, which reduces the cost
from $O(N^2)$ to $O(N \log N)$ per iteration. Recent work has claimed an asymptotic
dependence of $O((kd)^4)$. This appears very attractive indeed, but one should
be warned that this is an asymptotic calculation and there are some very large con- 
stants associated with this, as well as some possibly optimistic assumptions about 
rapid convergence of the iterative solver. Hence this attractive scaling behavior 
only manifests itself for electromagnetically very large problems. The convergence 
of iterative methods is also very problem dependent, so a particular analysis may 
not yield the expected asymptotic behavior if the solver should converge unexpect- 
edly slowly. For an overview of recent progress on fast methods, see Chew et al.’s 
review paper [20]. We will discuss fast methods in Chapter 6. 
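The principle behind a fast matrix-vector product can be shown with a much simpler structured matrix than those arising in the fast multipole or adaptive integral methods. The sketch below is an analogy only, not an implementation of either method: it assumes a circulant matrix, whose product with a vector is a circular convolution and can therefore be evaluated with FFTs in $O(N \log N)$ operations instead of the $O(N^2)$ of the dense product. The function names are hypothetical.

```python
import numpy as np


def dense_circulant_matvec(first_column, x):
    """O(N^2) product: build the full circulant matrix, then multiply."""
    n = len(first_column)
    matrix = np.empty((n, n))
    for j in range(n):
        # Column j of a circulant matrix is the first column shifted down by j.
        matrix[:, j] = np.roll(first_column, j)
    return matrix @ x


def fft_circulant_matvec(first_column, x):
    """O(N log N) product: circular convolution evaluated via the FFT."""
    spectrum = np.fft.fft(first_column) * np.fft.fft(x)
    return np.real(np.fft.ifft(spectrum))


rng = np.random.default_rng(0)
column = rng.standard_normal(256)
vector = rng.standard_normal(256)

slow = dense_circulant_matvec(column, vector)
fast = fft_circulant_matvec(column, vector)
print("maximum difference:", float(np.max(np.abs(slow - fast))))
```

The fast version never forms the matrix at all, which also saves memory. Inside an iterative solver, only this product is required, which is what makes the combination attractive for large problems.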

21 Assuming at least a surface discretization for the MoM.


Whilst great advances have been made, the full-wave techniques eventually
make impossible demands on even the largest supercomputers, and asymptotic
techniques become important. These methods generally use rays as field propagators,
and essentially localize electromagnetic interaction, describing the field at a
point as the sum of the direct, reflected, and various diffracted rays, all of which
originate at points (or sometimes lines) on the structure. With these methods, there
is no concept of discretization of an unknown field, although the surface may
well be approximated by facets.²² These methods generally rely on the asymptotic
nature of some underlying integral or series solution, and the approximation
improves with frequency. An excellent overview of this may be found in [21]. Because
the asymptotic techniques rely on approximations of the physics from the
start, they do not lend themselves as well as the full-wave methods to general-purpose
computer programs. However, when hybridized with the full-wave methods,
some very significant extensions to the frequency range of full-wave codes
become possible. Jakobus has made significant contributions recently using hybrid
MoM/PO approaches [22, 23] and much of this work is reflected in the commercial
program FEKO. Again, in Chapter 6, hybrid methods will be discussed in
more detail.

To paraphrase Hafner [16], CEM is a field which depends not just on “big ideas,” 
but also on getting lots of details right. This chapter has concentrated on the former, 
with the aim of providing the CEM beginner with some idea of what method is 
appropriate for what problem. Actually implementing a reliable CEM code makes 
enormous demands on the latter, and requires an on-going process of validation. 
One should be warned that even the most apparently straightforward method (the 
FDTD) is not as straightforward to implement as one might expect; development 
times for even the most specialized CEM codes involve at least months of work, 
and powerful, general-purpose codes involve many years of effort. 


1.10 CEM: the future 


CEM has passed through several phases: the 1960s and 1970s saw primarily work 
on CEM formulations; the 1980s saw the techniques starting to receive significant 
acceptance by non-specialists, and the 1990s saw the first widely available com- 
mercial codes for radio-frequency electromagnetic problems appear on the market. 
What does the next decade or so hold in store? 

Firstly, it appears that we can look forward to continuing giant strides in com- 
puter performance. Looking back over the last decade, a typical PC has increased 
its clock speed from some tens of megahertz to over a gigahertz, while memory
sizes have grown from under one megabyte to hundreds of megabytes, and disk 
sizes have increased from ten or twenty megabytes to the same or more in giga- 
bytes. (Workstations have also grown greatly in power, although their edge over 
top-end PCs is rather tenuous compared to the situation a decade ago.) This revo- 
lution in affordable computing has revolutionized potential CEM applications for 
engineers based in industry. 

22 In the case of physical optics (PO), the surface current is indeed discretized, but the amplitude of the current
is assumed in terms of the known incident field, rather than being computed from a matrix equation enforcing
a boundary condition.

CEM theory has also advanced enormously since the first work in the 1960s,
and far more RF electromagnetic problems are potentially amenable to a CEM 
solution. Much work can be expected on hybridizing methods — the benefits of 
FEM/MoM and MoM/PO hybrids have been noted in this chapter, and will be 
discussed in more detail later in this book. 

Intelligently refining meshes automatically is also an important topic, both in the 
research community and in commercial codes. Closely linked to this are methods 
for estimating errors in computed solutions; how this can be done will be briefly 
described in Chapter 10. Another significant trend is the incorporation of automatic 
optimizers using full-wave CEM tools for the analysis part of each iterative step 
in the optimization procedure. A number of commercial packages are starting to 
incorporate such abilities. 

It can also be expected that the user interfaces will continue to improve, mak- 
ing modelling complex three-dimensional devices quicker and easier. Furthermore, 
it is notable that some commercial packages are starting to offer more than one 
method within the same graphical user interface. A point that has been made often 
in this chapter, and will continue to be made frequently in this book, is that one 
should choose the appropriate method for the problem at hand; working within a
consistent user interface, it will be far easier for users to exploit the full power of 
the CEM techniques available. 

Perhaps the most important trend will be the use of increasingly powerful com- 
mercial packages, and a decline in the number (or at least use) of CEM “freeware.” 
This reflects the difficulty (and hence expense) of developing general-purpose
CEM packages; unless government sponsored (such as NEC), the cost of develop- 
ing and maintaining code has to be recovered by licensing. Intimately connected 
to this, CEM developers should expect an increasing number of non-expert users 
of CEM tools (in much the same way that FEM analysis is now routinely taught 
to undergraduate civil and mechanical engineers, and routinely used in industrial 
design). Codes increasingly need to be robust, incorporating warnings of inappro- 
priate meshing etc. for users without an extensive post-graduate training in electro- 
magnetics. Electromagnetics remains a challenging discipline, and educating users 
of CEM tools, as well as making the tools more robust, will become increasingly 
important — it is hoped that this book will contribute to the former. 


1.11 A “road map” of this book 


This book comprises essentially three parts. The first part, Chapters 2 and 3, deals
with the finite difference time domain method, in one and (primarily) two dimensions
respectively. Chapter 2 uses a simple transmission line problem to introduce
many of the basic ideas of the FDTD method. Chapter 3 goes on to extend
these ideas to two dimensions, and considers a number of the issues raised when
handling radiation and scattering in free space, in particular the use of absorbing
boundary conditions. In this context, an example is given of a perfectly matched
layer. The three-dimensional FDTD method is briefly discussed, and examples of
the use of a commercial three-dimensional code are presented. These two chapters
form an integrated unit.

The second part, Chapters 4-8, deals with the method of moments. Here, the 
five chapters largely alternate theoretical development with practical application.
Chapters 4 and 5 form a unit, first introducing MoM theory for thin-wire antennas, 
and then applying it using both a commercial and a public domain code. Chap- 
ter 6, on modelling surfaces (and also volumes) using the MoM, is largely self 
contained. The material in Chapter 6 on the hybrid MoM/PO, as well as on high- 
performance computing and fast methods, could be omitted without interrupting 
the flow of the book on a first reading. Chapters 7 and 8 form a further unit on 
the theory and application of the Sommerfeld mixed potential integral equation 
approach to modelling stratified media (of which microstrip antennas are the most 
widely encountered application at radio frequencies). The material in Chapter 7 
is amongst the most theoretically challenging in the book, and could be omitted 
or covered only superficially, whilst still allowing time for some of the examples 
in Chapter 8 to be studied. Similar comments also apply of course to readers in 
industry whose prime focus is on using the MoM. 

The third and final part, Chapters 9 and 10, is devoted to the finite element 
method. Chapter 9 goes directly into two-dimensional vector element FEM theory; 
it is also used to illustrate the solution of an eigenvalue problem. The material in 
the last chapter, Chapter 10, is primarily to sensitize readers to more advanced 
formulations and applications (in this case, of the FEM). 

For a course on CEM methods, there is probably more material here than can be 
covered in a typical semester course, and instructors can be guided by the above 
discussion regarding what to omit. Some suggested exercises are also included in 
Appendix D. They are intended primarily for use in a formal classroom environ- 
ment, but would be useful for self-study as well. 

Regarding the other appendices: good antenna and electromagnetic texts usu- 
ally include material on vector calculus, and it is assumed that the reader has at 
least one, so repeating it here seems superfluous. Instead, the appendices contain 
material which the author has found useful specifically in CEM, and which is not 
easy to find in the literature. 

A final comment. Electromagnetics, antenna engineering and microwave circuit
design are all extremely well-established fields, with excellent textbooks available.
This book is designed to complement them, not to compete with them. It is a text
specifically on the theory and applications of computational electromagnetics. It is
assumed that the reader has a suitable reference in his or her field of interest, so
this book does not define antenna radiation patterns, S-parameters or other well 
known and widely used concepts in this field. For readers who would like to add 
to their libraries, the following can be highly recommended. On electromagnet- 
ics in general, a very comprehensive reference is [24]; for antenna engineering, 
[25], [26] or [27] are all excellent references, as is [28] for microwave circuits and 
systems. There are many older texts which would also of course be suitable; the 
above are highlighted since they are all currently in print and have almost all been 
recently revised.


The finite difference time domain method:
a one-dimensional introduction
David B. Davidson and James T. Aberle


2.1 Introduction 


The finite difference time domain method, usually referred to as the FDTD, is a 
particular implementation of a general class of methods known as finite difference 
techniques. The FDTD is so widely used in the CEM community that although 
finite difference methods cover a wide spectrum of complexity and accuracy, it 
is the FDTD which is almost always implied in CEM when finite differences are 
mentioned. 

Finite difference methods are numerical methods in which derivatives are di- 
rectly approximated by finite difference quotients. The general class of such meth- 
ods is the most intuitive numerical approach, and was the first to be extensively 
developed by the scientific computing community. To this day, it probably remains 
the most universally applicable numerical technique and the one most widely used 
for scientific computation. As just discussed, for dynamic problems in CEM, the 
most popular is the FDTD method. This chapter opens with a discussion of
finite differences in general, before moving on to the specifics of the
FDTD.

At this point, a general comment about the philosophy underlying the mathe- 
matical treatment of the computational algorithms in this book would be in or- 
der. Although we endeavor not to be “sloppy” mathematically, the emphasis in 
this book is on presenting well-known methods for well-known problems in CEM,
rather than on the basic mathematical requirements of the methods, as one would 
expect to find in an applied mathematics text, for instance. An example of the 
type of issue which we will gloss over, at least initially, is the differentiation of 
discontinuous functions, which requires the generalized (weak) derivative, prop- 
erly the field of functional analysis. Fortunately, the physics-based problems we 
are addressing usually do not evidence the type of pathological behavior which 
can (rightly so) concern mathematicians, and issues such as existence proofs will
generally be treated superficially, if at all, in this book. The reader should be
cautioned about applying the methods discussed here in other fields of engineering
or applied science without first mastering the underlying theory in more
detail.


2.2 An overview of finite differences


2.2.1 The basic solution procedure


The basic steps of any finite difference method can be summarized as follows. 


Divide the solution region into a grid of nodes. 

Approximate derivatives in the given partial differential equation by finite differences 
involving the value of the solution at various nodes. 

Solve the finite difference equations for the value of the solution at each node subject
to boundary and/or initial conditions. If operating in the time domain, this amounts to
finding the values at the next time-step. This process is variously called time marching, 
time integration, or specifically in the context of the FDTD, “leap-frogging.” 


The FDTD method, being a time domain approach, is an initial value method 
(although material boundaries are of course included). Finite difference methods 
in general can operate as either boundary value or initial value methods. 
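The three steps can be sketched for a simple boundary value problem. The example below is an illustration chosen for this purpose, not one taken from the text: it assumes the one-dimensional problem $-u'' = \pi^2 \sin(\pi x)$ on $[0, 1]$ with $u(0) = u(1) = 0$, whose exact solution is $u = \sin(\pi x)$, and uses the standard three-point central difference for the second derivative. The function name is hypothetical.

```python
import numpy as np


def solve_poisson_1d(num_nodes):
    """Finite difference solution of -u'' = pi^2 sin(pi x), u(0) = u(1) = 0."""
    # Step 1: divide the solution region into a grid of nodes.
    x = np.linspace(0.0, 1.0, num_nodes)
    h = x[1] - x[0]

    # Step 2: approximate the derivative at each interior node by
    # (-u[i-1] + 2 u[i] - u[i+1]) / h^2, giving a tridiagonal matrix.
    n_interior = num_nodes - 2
    matrix = (
        2.0 * np.eye(n_interior)
        - np.diag(np.ones(n_interior - 1), 1)
        - np.diag(np.ones(n_interior - 1), -1)
    ) / h**2
    rhs = np.pi**2 * np.sin(np.pi * x[1:-1])

    # Step 3: solve for the nodal values subject to the boundary conditions
    # (the boundary nodes are simply held at zero).
    u = np.zeros(num_nodes)
    u[1:-1] = np.linalg.solve(matrix, rhs)
    return x, u


for num_nodes in (11, 21, 41):
    x, u = solve_poisson_1d(num_nodes)
    error = float(np.max(np.abs(u - np.sin(np.pi * x))))
    print(f"{num_nodes:3d} nodes: maximum error {error:.2e}")
```

A dense solver is used here purely for brevity; a practical code would exploit the tridiagonal structure. An initial value method such as the FDTD replaces the matrix solution in step 3 with an explicit update from one time-step to the next.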


2.2.2 Approximating derivatives using finite differences 


Central to all finite difference methods is the approximation of derivatives with 
finite differences. From the basic definition of the derivative of a function, various 
numerical approximations can be proposed. However, these are usually derived 
from a Taylor series expansion, since this provides a handle on the error. Depending
on whether the “next,” “previous,” or “central” nodes are involved, one obtains
forward, backward or central differencing as follows:


Forward difference formula for first derivative

$$\frac{dU(x)}{dx} = \frac{U(x+\Delta x) - U(x)}{\Delta x} - \frac{\Delta x}{2}\frac{d^2U}{dx^2} + O\left((\Delta x)^2\right) \tag{2.1}$$

Backward difference formula for first derivative

$$\frac{dU(x)}{dx} = \frac{U(x) - U(x-\Delta x)}{\Delta x} + \frac{\Delta x}{2}\frac{d^2U}{dx^2} + O\left((\Delta x)^2\right) \tag{2.2}$$

Central difference formula for first derivative

$$\frac{dU(x)}{dx} = \frac{U(x+\Delta x) - U(x-\Delta x)}{2\Delta x} - \frac{(\Delta x)^2}{6}\frac{d^3U}{dx^3} + O\left((\Delta x)^4\right) \tag{2.3}$$

These expressions are obtained by performing a Taylor series expansion of the
function around x. Let us consider the derivation of the central difference formula.


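The Taylor series expansion about $x_0$, evaluated at $x_0 + \Delta x$, is:

$$U(x_0+\Delta x) = U(x_0) + \Delta x \left.\frac{\partial U}{\partial x}\right|_{x=x_0} + \frac{(\Delta x)^2}{2}\left.\frac{\partial^2 U}{\partial x^2}\right|_{x=x_0} + \frac{(\Delta x)^3}{6}\left.\frac{\partial^3 U}{\partial x^3}\right|_{x=x_0} + \frac{(\Delta x)^4}{24}\left.\frac{\partial^4 U}{\partial x^4}\right|_{x=\xi} \tag{2.4}$$

Here ξ is a point located in the interval $(x_0, x_0 + \Delta x)$. This can alternatively be written
as

$$U(x_0+\Delta x) = U(x_0) + \Delta x \left.\frac{\partial U}{\partial x}\right|_{x=x_0} + \frac{(\Delta x)^2}{2}\left.\frac{\partial^2 U}{\partial x^2}\right|_{x=x_0} + \frac{(\Delta x)^3}{6}\left.\frac{\partial^3 U}{\partial x^3}\right|_{x=x_0} + \frac{(\Delta x)^4}{24}\left.\frac{\partial^4 U}{\partial x^4}\right|_{x=x_0} + O\left((\Delta x)^5\right) \tag{2.5}$$

A similar expansion is performed about $x_0$, evaluated at $x_0 - \Delta x$:

$$U(x_0-\Delta x) = U(x_0) - \Delta x \left.\frac{\partial U}{\partial x}\right|_{x=x_0} + \frac{(\Delta x)^2}{2}\left.\frac{\partial^2 U}{\partial x^2}\right|_{x=x_0} - \frac{(\Delta x)^3}{6}\left.\frac{\partial^3 U}{\partial x^3}\right|_{x=x_0} + \frac{(\Delta x)^4}{24}\left.\frac{\partial^4 U}{\partial x^4}\right|_{x=x_0} + O\left((\Delta x)^5\right) \tag{2.6}$$

Subtracting the two expressions, grouping terms, dividing by 2Δx and noting that
the terms in even powers of Δx cancel, we obtain Eq. (2.3).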
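
As a worked illustration of the subtraction step (a sketch only, using the expansions (2.5) and (2.6) about $x_0$), the even-order terms cancel and the odd-order terms double, which is where the factor 2Δx and the second-order remainder come from:

```latex
\begin{align*}
U(x_0+\Delta x) - U(x_0-\Delta x)
  &= 2\,\Delta x \left.\frac{\partial U}{\partial x}\right|_{x=x_0}
   + \frac{(\Delta x)^3}{3}\left.\frac{\partial^3 U}{\partial x^3}\right|_{x=x_0}
   + O\!\left((\Delta x)^5\right) \\
\left.\frac{\partial U}{\partial x}\right|_{x=x_0}
  &= \frac{U(x_0+\Delta x) - U(x_0-\Delta x)}{2\,\Delta x}
   - \frac{(\Delta x)^2}{6}\left.\frac{\partial^3 U}{\partial x^3}\right|_{x=x_0}
   + O\!\left((\Delta x)^4\right)
\end{align*}
```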


A mathematical aside — finite difference approximations of the second 
derivative 


If, instead of differencing Eqs. (2.5) and (2.6) as above, we add them, we obtain
a formula for the central difference approximation of the second derivative of
the function. The result, with remainder term of second order, is:

$$\frac{d^2U(x)}{dx^2} = \frac{U(x+\Delta x) - 2U(x) + U(x-\Delta x)}{(\Delta x)^2} - \frac{(\Delta x)^2}{12}\frac{d^4U}{dx^4} + O\left((\Delta x)^4\right) \tag{2.7}$$

(Note that the terms in (Δx)³ also cancel.) In the FDTD, we will not directly use
this formula, but it turns out that the FDTD scheme is also second-order accurate.
The reason is that the central difference formula for the second derivative can
also be derived by combining the expressions for the forward and backward
derivatives to first order, which is what the FDTD effectively does.


Although Eqs. (2.1), (2.2) and (2.3) appear similar (indeed, the first part of each
is identical), the remainder (error) terms are not. For both forward and backward
differencing, the error is proportional to the cell length Δx (also known as a first-order
scheme), but for central differencing, it is proportional to the square thereof,
giving a second-order scheme.¹ Clearly, in the limit Δx → 0, the central
difference formula will converge more rapidly to the true value of the derivative.
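
The difference in convergence rate is easy to observe numerically. The following is a minimal sketch, not code from the book: the function names and the choice of test function (sin x at x = 1) are assumptions made purely for illustration. It estimates the order of each scheme by halving the step once and comparing the errors.

```python
import math


def forward_diff(f, x, dx):
    """Forward difference: leading part of Eq. (2.1), remainder dropped."""
    return (f(x + dx) - f(x)) / dx


def backward_diff(f, x, dx):
    """Backward difference: leading part of Eq. (2.2), remainder dropped."""
    return (f(x) - f(x - dx)) / dx


def central_diff(f, x, dx):
    """Central difference: leading part of Eq. (2.3), remainder dropped."""
    return (f(x + dx) - f(x - dx)) / (2.0 * dx)


def observed_order(diff, f, exact, x, dx):
    """Estimate p in error ~ dx**p from the errors at steps dx and dx/2."""
    err_coarse = abs(diff(f, x, dx) - exact)
    err_fine = abs(diff(f, x, dx / 2.0) - exact)
    return math.log2(err_coarse / err_fine)


if __name__ == "__main__":
    x, dx = 1.0, 1.0e-2
    exact = math.cos(x)  # derivative of sin(x)
    for name, diff in (("forward", forward_diff),
                       ("backward", backward_diff),
                       ("central", central_diff)):
        order = observed_order(diff, math.sin, exact, x, dx)
        print(f"{name:8s} observed order: {order:.3f}")
```

Halving Δx roughly halves the error of the forward and backward formulas but roughly quarters the error of the central formula, in line with the first- and second-order remainder terms.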

This idea of direct discretization of the derivatives underlies the FD method; one 
should rather view this as a class of methods, since there are a variety of choices 
which one can make with regard to the specific FD algorithm. Before moving
on to the FDTD method, one last general point should be made: FD methods can
be either implicit or explicit. This is particularly relevant when time is one of the
variables. An implicit method requires the solution of a set of simultaneous equations
(a matrix equation) in order to evaluate the unknowns. (The resulting matrix
is generally highly sparse, i.e. has only a few non-zero entries. Efficient FD solvers
exploit this to save both memory and computational time.) From a physics viewpoint,
with an implicit method, the “next” value at a point is a function not only of
the current and past values at this and the surrounding points, but also the “next”
values of some or all of these. In an explicit method, each unknown can be obtained
directly in terms of given or previously computed values. Physically, the next value
is computed entirely from current or past values. Explicit methods do not require
any matrix solution. However, they usually have some maximum time-step size
(generally known as the Courant limit) which, if exceeded, produces instability.

It should be noted that there are other methods for obtaining numerical deriva- 
tives. By using more points, higher-order schemes can be derived. However, the 
Yee scheme, to be discussed, does not readily accommodate these in general. 


2.3 A very brief history of the FDTD 


The FDTD is based on a particular FD scheme (Yee’s algorithm) that is applied
to Maxwell’s curl equations in the time domain. It is an explicit marching-in-time
procedure that simulates the propagation and interaction of electromagnetic waves
in a region of space. At present, the FDTD is probably the most popular numerical
method for the solution of RF electrodynamic problems, due to its simplicity and
generality.

¹ A reminder: if a function σ(x) is said to be O(xⁿ), then there exists some constant A such that σ(x) < Axⁿ, ∀x.

The algorithm was first proposed by Yee in 1966 [1]. For around a decade, 
the method attracted little, if any, attention; the computational electromagnetics 
community was primarily exploring the method of moments during this period. In 
1975, Taflove and Brodwin obtained the correct stability criteria, and computed 
sinusoidal steady-state solutions using the method. In 1977, Holland, Kunz and 
Lee applied the method to electromagnetic pulse problems. In 1981, Mur obtained 
the first numerically stable, second-order accurate absorbing boundary conditions. 
From then on, the popularity of the method grew in leaps and bounds, as a number 
of theoretical issues were solved in rapid succession, culminating in Berenger’s 
perfectly matched layer in 1994. The rapid adoption of the method was also due 
to the explosive growth in especially personal computing; in 1966, realistic ap- 
plications of the FDTD made what were then outrageous demands on contempo- 
rary computers, whereas those of the MoM were decidedly more modest; by the 
1990s, Moore’s law had ensured that many realistic FDTD simulations could be 
undertaken on a PC in minutes or at most hours. Hence both theoretical develop- 
ments and technological progress played crucial roles in the development of the 
method. 

Theoretical work on the FDTD continues to this day, although the main thrust of 
most work is now in terms of applications. A detailed chronology, with extensive 
references, may be found in [2, Section 1.5]. 


2.4 A one-dimensional introduction to the FDTD 
2.4.1 A one-dimensional model problem: a lossless transmission line 


To introduce the FDTD algorithm, we will consider a lossless transmission line 
problem. From basic transmission line theory, the reader will be aware that for 
transverse electromagnetic (TEM) modes, there is a one-to-one correspondence 
between electric fields and voltage, and magnetic fields and current. Hence in the 
following, although we use voltage and current, this is fully equivalent to a field 
description of a TEM transmission line. 


A reminder — TEM modes 


As noted in the main text, for fields which are entirely transverse electromag- 
netic in nature, there is a one-to-one correspondence between electric fields and 
voltage, and magnetic fields and current. The best known example is a coaxial 
line. If the voltage between the inner (radius a) and outer (radius b) conductors is V, then the
radial electric field is V/(r ln(b/a)), and with current I, the circumferential magnetic
field is I/(2πr). The simplest example is the parallel plate waveguide, separation
d, at potential difference V, where the electric field is V/d and the surface
current density and magnetic field are numerically equal, although orthogonal in
space since $\vec{J}_s = \hat{n} \times \vec{H}$.

For guiding structures supporting more complex modes, such as TE or TM, a 
correspondence may still be found but it is no longer unique. 


The well-known equivalent circuit of an infinitesimal piece of transmission line 
is shown in Fig. 2.1. L is the inductance per unit length and C is the capacitance
per unit length of the lossless transmission line. On this section of line, the voltage 
and current on the line are described by a pair of coupled first-order differential 
equations, frequently known as the telegraphist’s equations: 


$$\frac{\partial I(z,t)}{\partial z} = -C\,\frac{\partial V(z,t)}{\partial t} \tag{2.8}$$

$$\frac{\partial V(z,t)}{\partial z} = -L\,\frac{\partial I(z,t)}{\partial t} \tag{2.9}$$


As already noted, the transmission line equations are a special case of Maxwell’s 
curl equations in one dimension. 

At this stage, we could decouple the differential equations to obtain the wave 
equation (which is a second-order partial differential equation) for the voltage on 
the line as a function of position and time. (This is the approach generally taken in 
introductory electromagnetics texts; the result is the one-dimensional wave equa- 
tion, in either voltage or current.) However, we will instead work directly with the 
coupled pair of first-order equations. 
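
For reference, the decoupling step can be sketched as follows. This is the standard manipulation, assuming L and C are constant along the line: differentiate Eq. (2.9) with respect to z and Eq. (2.8) with respect to t, then eliminate the mixed derivative of the current.

```latex
\begin{align*}
\frac{\partial^2 V}{\partial z^2} &= -L\,\frac{\partial^2 I}{\partial z\,\partial t},
\qquad
\frac{\partial^2 I}{\partial t\,\partial z} = -C\,\frac{\partial^2 V}{\partial t^2} \\
\Rightarrow\quad
\frac{\partial^2 V}{\partial z^2} &= LC\,\frac{\partial^2 V}{\partial t^2}
\end{align*}
```

The same steps with the roles of the two equations exchanged give the identical equation for the current.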

Figure 2.1 Infinitesimal section of a one-dimensional transmission line.

Figure 2.2 Model transmission line problem.

Consider the following transmission line circuit problem, illustrated in Fig. 2.2.
Assume L = 1 H/m, C = 1 F/m, h = 0.25 m, $R_S$ = 1 Ω, and $R_L$ = 2 Ω. (Note that
this choice of L and C produces a characteristic impedance of 1 Ω, and velocity of
propagation of 1 m/s. Clearly, this is a normalized version of the actual problem;
normalized equations such as these are quite frequently used in physics [3, 4].)
The following, then, will be our model problem:
The following, then, will be our model problem: 


The model 1D problem 


Determine the phasor representation of the steady-state response V(z) versus z for

$$V_0(t) = \cos(8\pi t), \quad t > 0 \tag{2.10}$$

The boundary conditions (BCs) at z = 0 and z = h are:

$$V(0,t) = V_0(t) - R_S\, I(0,t) \tag{2.11}$$

$$V(h,t) = R_L\, I(h,t) \tag{2.12}$$

Take the initial conditions (ICs) to be:

$$V(z,0) = I(z,0) = 0 \tag{2.13}$$


A mathematical aside — classification of this problem 


This problem is a deterministic, interior problem controlled by a hyperbolic par- 
tial differential equation, with mixed boundary conditions. It is deterministic 
since there is a source. It is an interior problem since the domain lies inside the 
boundaries. The vector wave equation is a hyperbolic partial differential equa- 
tion, and the boundary conditions involve both voltage and current. 


The exact solution can of course be readily derived using standard transmission
line theory. (This is extremely useful, since it will permit us to test the accuracy of
our solution.) Noting that the source is matched, the result is:

$$V_{ss}(z) = V^+\left(e^{-j\beta z} + \Gamma e^{j\beta(z-2h)}\right) \tag{2.14}$$

where

$$V^+ = 1/2 \tag{2.15}$$

$$\Gamma = 1/3 \tag{2.16}$$

$$\beta = 8\pi \ \text{rad/m} \tag{2.17}$$


Before moving on to the FDTD solution, it should be noted that the above solu- 
tion is the phasor — i.e. frequency domain — solution of the problem. The excitation 
is a single-frequency sinusoid, angular frequency ω = 8π rad/s, or f = 4 Hz. Since
the speed of propagation is 1 m/s, the phase constant/wavenumber is also 8π rad/m,
and the wavelength is 1/4 m. The FDTD is a time domain solver, so we need to 
bear in mind that we are either going to have to convert the above solution into 
the time domain, or transform our FDTD solution into the frequency domain. The 
Fourier transform will of course provide the connection. We will also have to bear 
in mind that the above is the steady-state solution of the problem; there is also the 
transient part of the solution, which the FDTD solution is also going to include. 
We will discuss how to deal with these issues subsequently. 
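
Since the exact phasor solution serves as the reference for everything that follows, it is convenient to have it available as a function. The sketch below is an illustration only. The function name and defaults are assumptions, and the expression used for the forward-wave amplitude is the standard transmission line result for a general source resistance (not taken from this chapter); for the matched source of the model problem it reduces to 1/2.

```python
import cmath
import math


def exact_phasor(z, h=0.25, L=1.0, C=1.0, R_S=1.0, R_L=2.0,
                 omega=8.0 * math.pi):
    """Steady-state voltage phasor on the line at position z (hypothetical helper).

    Assumes a unit-amplitude cosine source with series resistance R_S at z = 0
    and a resistive load R_L at z = h.
    """
    Z0 = math.sqrt(L / C)            # characteristic impedance
    beta = omega * math.sqrt(L * C)  # phase constant
    gamma_load = (R_L - Z0) / (R_L + Z0)
    gamma_src = (R_S - Z0) / (R_S + Z0)
    # Forward-wave amplitude; equals 1/2 when the source is matched (R_S = Z0).
    v_plus = (Z0 / (Z0 + R_S)) / (
        1.0 - gamma_src * gamma_load * cmath.exp(-2j * beta * h))
    return v_plus * (cmath.exp(-1j * beta * z)
                     + gamma_load * cmath.exp(1j * beta * (z - 2.0 * h)))


if __name__ == "__main__":
    for k in range(11):
        z = 0.025 * k
        v = exact_phasor(z)
        print(f"z = {z:.3f} m  V = {v.real:+.4f} {v.imag:+.4f}j")
```

With the default (model problem) values the line is exactly one wavelength long, so the phasor has the same value at z = 0 and z = h.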


2.4.2 FDTD solution of the one-dimensional lossless transmission 
line problem 


The first step in obtaining an FDTD solution is to set up a regular grid in space
and time. The points on this grid can be designated as $(z_k, t_n)$ where

$$z_k = (k-1)\,\Delta z, \quad k = 1, 2, \ldots, N_z \tag{2.18}$$

$$\Delta z = \frac{h}{N_z - 1}, \quad N_z \ge 2 \tag{2.19}$$

$$t_n = (n-1)\,\Delta t, \quad n = 1, 2, 3, \ldots \tag{2.20}$$

$$\Delta t = \frac{T}{M-1}, \quad M \ge 2 \tag{2.21}$$

As noted in the introductory remarks of this chapter, additional grid points at
half-time and half-space points are now also introduced. These additional points
can be designated as $(z_{k+1/2}, t_{n+1/2})$ where

$$z_{k+1/2} = \left(k - \tfrac{1}{2}\right)\Delta z, \quad k = 1, 2, \ldots, N_z - 1 \tag{2.22}$$

$$t_{n+1/2} = \left(n - \tfrac{1}{2}\right)\Delta t, \quad n = 1, 2, 3, \ldots \tag{2.23}$$
We shall compute V(z,t) at the points $(z_k, t_n)$, and I(z,t) at the points
$(z_{k+1/2}, t_{n+1/2})$, i.e. the voltage and currents are computed at offset locations in
space and also in time. We now have two two-dimensional arrays representing the
voltage and current. In each array, a row represents the spatial distribution of the
field at a particular point in time, and a column represents the temporal evolution
of the field at a particular point in space. (This is very convenient in understanding
the method, but we should note now that the FDTD generally stores only two or
three rows of each array; we will see subsequently how this is possible.)

To assist us in imposing the mixed BCs at z = 0 and z = h, two additional
fictitious columns outside of the boundaries of the problem will be introduced,
corresponding to:

$$z_{1/2} = -\tfrac{1}{2}\,\Delta z \tag{2.24}$$

$$z_{N_z+1/2} = h + \tfrac{1}{2}\,\Delta z \tag{2.25}$$

Similarly, to assist in the imposition of the initial conditions at t = 0, an
additional row will be introduced corresponding to:

$$t_{1/2} = -\tfrac{1}{2}\,\Delta t \tag{2.26}$$

Figure 2.3 shows these grid points graphically. The “o” indicates points at which
V(z, t) is computed, and “+” points at which I(z, t) is computed. (As drawn here,
vertical cuts correspond to temporal evolution at a particular point in space, horizontal
cuts to spatial distribution at a particular point in time.)
The first transmission line equation, Eq. (2.8), is approximated at $z_k$ and $t_{n+1/2}$
using central differencing in both space and time, i.e.,

$$\frac{\partial I(z_k, t_{n+1/2})}{\partial z} \approx \frac{I_{k+1/2}^{n+1/2} - I_{k-1/2}^{n+1/2}}{\Delta z} \tag{2.27}$$

$$\frac{\partial V(z_k, t_{n+1/2})}{\partial t} \approx \frac{V_k^{n+1} - V_k^{n}}{\Delta t} \tag{2.28}$$

Thus, the update equation for V may be obtained as

$$V_k^{n+1} = V_k^{n} - \frac{\Delta t}{C\,\Delta z}\left(I_{k+1/2}^{n+1/2} - I_{k-1/2}^{n+1/2}\right) \tag{2.29}$$

This update equation for V may be represented schematically by the “computational
molecule” or “stencil” shown in Fig. 2.4. From this, it is clear that the update
equation can be used for $k = 2, \ldots, N_z - 1$ and n ≥ 2. Special update equations
must be devised from the initial and boundary conditions to treat n = 1 and k = 1
and $k = N_z$.

Figure 2.3 The Yee grid. (Plot title: Grid for determining voltage on lossless line, Nz = 11, M = 16.)

Figure 2.4 The voltage stencil.

The second transmission line equation, Eq. (2.9), is approximated at $z_{k+1/2}$ and
$t_n$ using central differencing in both space and time, i.e.,

$$\frac{\partial V(z_{k+1/2}, t_n)}{\partial z} \approx \frac{V_{k+1}^{n} - V_k^{n}}{\Delta z} \tag{2.30}$$

$$\frac{\partial I(z_{k+1/2}, t_n)}{\partial t} \approx \frac{I_{k+1/2}^{n+1/2} - I_{k+1/2}^{n-1/2}}{\Delta t} \tag{2.31}$$

Thus, the update equation for I may be obtained as

$$I_{k+1/2}^{n+1/2} = I_{k+1/2}^{n-1/2} - \frac{\Delta t}{L\,\Delta z}\left(V_{k+1}^{n} - V_k^{n}\right) \tag{2.32}$$

The update equation for I may be represented schematically as in Fig. 2.5.
Again, it is clear that the update equation can be used for $k = 1, \ldots, N_z - 1$ and
n ≥ 2. Special update equations must be devised from the ICs to treat n = 1. However,
no special treatment for the boundaries at z = 0 or z = h needs to be implemented.

Figure 2.5 The current stencil.
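The update equation for V must be modified at k = 1 and $k = N_z$ to incorporate
the BCs into the solution. Consider the update equation for V at k = 1 and $k = N_z$:

$$V_1^{n+1} = V_1^{n} - \frac{\Delta t}{C\,\Delta z}\left(I_{3/2}^{n+1/2} - I_{1/2}^{n+1/2}\right) \tag{2.33}$$

$$V_{N_z}^{n+1} = V_{N_z}^{n} - \frac{\Delta t}{C\,\Delta z}\left(I_{N_z+1/2}^{n+1/2} - I_{N_z-1/2}^{n+1/2}\right) \tag{2.34}$$

The values of $I_{1/2}^{n+1/2}$ and $I_{N_z+1/2}^{n+1/2}$ must be obtained from the BCs. Consider the
BC at z = 0:

$$V(0, t_n) + R_S\, I(0, t_n) = V_0(t_n) \tag{2.35}$$

Using

$$V(0, t_n) = V_1^{n} \tag{2.36}$$

$$I(0, t_n) \approx \frac{1}{2}\left[I_{3/2}^{n+1/2} + I_{1/2}^{n+1/2}\right] \tag{2.37}$$

the discretized BC gives

$$I_{1/2}^{n+1/2} = -I_{3/2}^{n+1/2} - \frac{2}{R_S}V_1^{n} + \frac{2}{R_S}V_0(t_n) \tag{2.38}$$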
A mathematical aside — semi-implicit approximations


Equation (2.38) is sometimes known as a semi-implicit approximation, since 
the “next” value at point 1/2 also uses the “next” value at point 3/2. This is 
also used when conduction currents are included in a full-wave solver. Although 
widely and successfully used, this approximation can degrade both the stability 
and accuracy of the solver. 


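Consider the BC at z = h:

$$V(h, t_n) - R_L\, I(h, t_n) = 0 \tag{2.39}$$

Using

$$V(h, t_n) = V_{N_z}^{n} \tag{2.40}$$

$$I(h, t_n) \approx \frac{1}{2}\left[I_{N_z+1/2}^{n+1/2} + I_{N_z-1/2}^{n+1/2}\right] \tag{2.41}$$

the discretized BC gives

$$I_{N_z+1/2}^{n+1/2} = -I_{N_z-1/2}^{n+1/2} + \frac{2}{R_L}V_{N_z}^{n} \tag{2.42}$$

Using the values of $I_{1/2}^{n+1/2}$ and $I_{N_z+1/2}^{n+1/2}$ derived from the BCs, we obtain the update
equations for V at k = 1 and $k = N_z$:

$$V_1^{n+1} = \left(1 - \frac{2\Delta t}{R_S C\,\Delta z}\right)V_1^{n} - \frac{2\Delta t}{C\,\Delta z}I_{3/2}^{n+1/2} + \frac{2\Delta t}{R_S C\,\Delta z}V_0(t_n) \tag{2.43}$$

$$V_{N_z}^{n+1} = \left(1 - \frac{2\Delta t}{R_L C\,\Delta z}\right)V_{N_z}^{n} + \frac{2\Delta t}{C\,\Delta z}I_{N_z-1/2}^{n+1/2} \tag{2.44}$$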

To start the FD scheme, we need to obtain the values of $V_k^{1}$ for $k = 1, \ldots, N_z$
and $I_{k+1/2}^{3/2}$ for $k = 1, \ldots, N_z - 1$.

The values of $V_k^{1}$ may be obtained from the initial condition V(z, 0) = 0. Hence,
$V_k^{1} = 0$ for $k = 1, \ldots, N_z$.

The update equation for I for n = 1 is:

$$I_{k+1/2}^{3/2} = I_{k+1/2}^{1/2} - \frac{\Delta t}{L\,\Delta z}\left(V_{k+1}^{1} - V_k^{1}\right) = I_{k+1/2}^{1/2} \tag{2.45}$$

The value of $I_{k+1/2}^{1/2}$ must be obtained from the initial condition. Consider the initial
condition:

$$I(z_{k+1/2}, 0) = 0 \tag{2.46}$$

Clearly, I must be zeroed at all points at both time-step 1/2 and 3/2.


Summary: FDTD scheme for the model problem 


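In summary, the FD scheme for this problem is:

$$V_k^{1} = 0, \quad \text{for } k = 1, \ldots, N_z \tag{2.47}$$

$$I_{k+1/2}^{3/2} = 0, \quad \text{for } k = 1, \ldots, N_z - 1 \tag{2.48}$$

For n ≥ 2,

$$V_1^{n} = \left(1 - \frac{2\Delta t}{R_S C\,\Delta z}\right)V_1^{n-1} - \frac{2\Delta t}{C\,\Delta z}I_{3/2}^{n-1/2} + \frac{2\Delta t}{R_S C\,\Delta z}V_0(t_{n-1}) \tag{2.49}$$

$$V_k^{n} = V_k^{n-1} - \frac{\Delta t}{C\,\Delta z}\left(I_{k+1/2}^{n-1/2} - I_{k-1/2}^{n-1/2}\right), \quad \text{for } k = 2, \ldots, N_z - 1 \tag{2.50}$$

$$V_{N_z}^{n} = \left(1 - \frac{2\Delta t}{R_L C\,\Delta z}\right)V_{N_z}^{n-1} + \frac{2\Delta t}{C\,\Delta z}I_{N_z-1/2}^{n-1/2} \tag{2.51}$$

$$I_{k+1/2}^{n+1/2} = I_{k+1/2}^{n-1/2} - \frac{\Delta t}{L\,\Delta z}\left(V_{k+1}^{n} - V_k^{n}\right), \quad \text{for } k = 1, \ldots, N_z - 1 \tag{2.52}$$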
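
The summary scheme translates almost line by line into code. The following is a minimal sketch under stated assumptions, not the book's own script: the function name `fdtd_line`, its defaults and the returned history list are hypothetical, T is taken to be the period 1/f of the excitation, and Python's zero-based lists mean that `V[j]` holds $V_{j+1}$ and `I[j]` holds $I_{j+3/2}$.

```python
import math


def fdtd_line(Nz=11, M=64, n_steps=640, L=1.0, C=1.0, h=0.25,
              R_S=1.0, R_L=2.0, f=4.0):
    """Time-march Eqs. (2.47)-(2.52); returns (voltage history, dz, dt)."""
    dz = h / (Nz - 1)              # Eq. (2.19)
    dt = (1.0 / f) / (M - 1)       # Eq. (2.21), assuming T = 1/f
    a = dt / (C * dz)
    b = dt / (L * dz)

    V = [0.0] * Nz                 # Eq. (2.47): V at time level n = 1
    I = [0.0] * (Nz - 1)           # Eq. (2.48): I at time level 3/2
    history = [V[:]]

    for n in range(2, n_steps + 1):
        t_prev = (n - 2) * dt      # t_{n-1}, from Eq. (2.20)
        v0 = math.cos(2.0 * math.pi * f * t_prev)

        new_V = V[:]
        # Source end, Eq. (2.49)
        new_V[0] = (1.0 - 2.0 * a / R_S) * V[0] - 2.0 * a * I[0] \
            + (2.0 * a / R_S) * v0
        # Interior nodes, Eq. (2.50)
        for j in range(1, Nz - 1):
            new_V[j] = V[j] - a * (I[j] - I[j - 1])
        # Load end, Eq. (2.51)
        new_V[Nz - 1] = (1.0 - 2.0 * a / R_L) * V[Nz - 1] + 2.0 * a * I[Nz - 2]
        V = new_V

        # Currents, Eq. (2.52), updated in place from the new voltages
        for j in range(Nz - 1):
            I[j] -= b * (V[j + 1] - V[j])

        history.append(V[:])

    return history, dz, dt


if __name__ == "__main__":
    hist, dz, dt = fdtd_line()
    print("dz =", dz, "dt =", dt)
    print("final voltages:", [round(v, 4) for v in hist[-1]])
```

Storing the full history is done here only to make the result easy to inspect; the marching itself needs nothing beyond the current voltage and current vectors.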


Programming aspects: avoiding half-steps 
Half-integer values are inconvenient to program. To avoid them, we can simply
make the following changes:

$$n + 1/2 \rightarrow n \tag{2.53}$$

$$k + 1/2 \rightarrow k \tag{2.54}$$

However, this is only a matter of notational convenience: the voltages and currents
are still located at the relevant points and times, with half-offsets as appropriate!
This must always be kept in mind. This also extends to both two- and
three-dimensional FDTD solvers.


Programming aspects: “in-place” operations 


A careful study of the update equations above shows that once all the next values
of voltage (i.e. $V_k^{n}$) have been obtained, the current values ($V_k^{n-1}$) are never needed
again. (The update equation for current, viz. $I_{k+1/2}^{n+1/2}$, requires only $V_k^{n}$.) As such,
it is usual practice in an FDTD code to overwrite $V_k^{n-1}$ with $V_k^{n}$ at the end of
each time-step. Indeed, it can be done immediately on a point-by-point basis, but
it is usually more convenient to do this all in one vector update. Hence only four
vectors need be stored, two for voltage and two for current. If, for some reason,
one wants the complete time history at a point (or plane or volume) then this is
usually stored in a separate array. In signal processing, such operations are known
as “in place” operations.


Programming aspects: reducing the “operation count” 


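It is also possible to reduce the number of operations per time-step (and reduce
memory requirements in inhomogeneous problems) if the following change of
variables is made:

$$\tilde{I} = \frac{\Delta t}{C\,\Delta z}\, I \tag{2.55}$$

The algorithm becomes:

$$V_k^{1} = 0, \quad \text{for } k = 1, \ldots, N_z \tag{2.56}$$

$$\tilde{I}_{k+1/2}^{3/2} = 0, \quad \text{for } k = 1, \ldots, N_z - 1 \tag{2.57}$$

For n ≥ 2,

$$V_1^{n} = (1 - \beta_1)V_1^{n-1} - 2\tilde{I}_{3/2}^{n-1/2} + \beta_1 V_0(t_{n-1}) \tag{2.58}$$

$$V_k^{n} = V_k^{n-1} - \left(\tilde{I}_{k+1/2}^{n-1/2} - \tilde{I}_{k-1/2}^{n-1/2}\right), \quad \text{for } k = 2, \ldots, N_z - 1 \tag{2.59}$$

$$V_{N_z}^{n} = (1 - \beta_2)V_{N_z}^{n-1} + 2\tilde{I}_{N_z-1/2}^{n-1/2} \tag{2.60}$$

$$\tilde{I}_{k+1/2}^{n+1/2} = \tilde{I}_{k+1/2}^{n-1/2} - r\left(V_{k+1}^{n} - V_k^{n}\right), \quad \text{for } k = 1, \ldots, N_z - 1 \tag{2.61}$$

where

$$\beta_1 = \frac{2\Delta t}{R_S C\,\Delta z} \tag{2.62}$$

$$\beta_2 = \frac{2\Delta t}{R_L C\,\Delta z} \tag{2.63}$$

$$r = \frac{(\Delta t)^2}{LC(\Delta z)^2} \tag{2.64}$$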

Obtaining and evaluating preliminary results 


As commented in the opening of this chapter, the FDTD computes results in the
time domain. The analytical (phasor, steady state) solution is in the frequency
domain. As is well known from circuit theory, the response of a system is the
superposition of the transient and steady-state responses. By Fourier transforming
the solution at different times, and noting the change in the solution, we can
effectively eliminate the transient part of the solution. An estimate of the phasor
representation of the steady-state response is obtained at the end of each period
(of the sinusoid of 4 Hz frequency, i.e. every 250 ms) by evaluating the Fourier
coefficient at frequency ω = 8π rad/s of the time domain data from the FDTD solution.
In computing the response, time domain data are stored for one period, and
then overwritten. Steady state is taken to be achieved when the normalized RMS
discrepancy between consecutive estimates is less than some positive error bound,
i.e., $D_{rms} < \epsilon$ with ε > 0. As a measure of accuracy, we evaluate the normalized RMS
error of Yee’s algorithm with respect to the exact solution, viz. $E_{rms}$. The result
of this is given in Fig. 2.6. This particular solution required N = 6 for convergence
with $E_{rms}$ = 0.0432. Note that since this is a phasor, the result is of course complex,
with both real and imaginary parts.

Figure 2.6 Solution for $N_z$ = 11, M = 64, and ε = 0.002.
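
The two ingredients of this procedure, the single-frequency Fourier coefficient and a normalized RMS measure, can be sketched as below. This is an illustration under assumptions: the function names are hypothetical, the samples are taken to cover exactly one period with uniform spacing (the end point of the period is not repeated), and the particular normalization of the RMS measure is one reasonable choice rather than necessarily the one used to produce the quoted figures.

```python
import cmath
import math


def phasor_estimate(samples, omega, dt):
    """Fourier coefficient at omega from P samples spanning one period.

    For v(t) = Re{V exp(j*omega*t)} sampled at t = 0, dt, ..., (P-1)*dt with
    P*dt equal to the period, this returns V (to round-off) when P >= 3.
    """
    P = len(samples)
    total = sum(v * cmath.exp(-1j * omega * m * dt)
                for m, v in enumerate(samples))
    return 2.0 * total / P


def normalized_rms(estimate, reference):
    """RMS of (estimate - reference), normalized by the RMS of reference."""
    num = sum(abs(x - y) ** 2 for x, y in zip(estimate, reference))
    den = sum(abs(y) ** 2 for y in reference)
    return math.sqrt(num / den)


if __name__ == "__main__":
    omega = 8.0 * math.pi
    P = 63
    dt = 0.25 / P
    V_true = 0.3 - 0.4j
    samples = [(V_true * cmath.exp(1j * omega * m * dt)).real for m in range(P)]
    print("recovered phasor:", phasor_estimate(samples, omega, dt))
```

In use, `phasor_estimate` would be applied node by node to the data stored over the most recent period, and `normalized_rms` to consecutive estimates (for the convergence check) or to the estimate and the exact solution (for the error).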


2.4.3 Accuracy, convergence, consistency and stability of the method 


For any numerical method, important questions which one must pose include the 
following. 
Accuracy The degree to which the numerical solution to the approximate field
problem approximates the exact solution to the approximate field problem. 


Consistency A finite difference equation is said to be consistent with a PDE pro- 
vided that the local discretization error tends to zero as the mesh density increases, 
or alternatively, the mesh increment decreases. This is another statement of con- 
vergence: the numerical solution should converge to the exact solution as the mesh 
is refined (for this FDTD problem, this implies Az — 0). 


Stability A process (e.g., a finite difference scheme) is said to be stable if and 
only if errors introduced at any stage in the process remain bounded throughout 
the entire evolution of the process. 


The Lax equivalency theorem states that if a finite difference scheme is consistent
with a properly posed² linear problem, then stability is the necessary and sufficient
condition for convergence. (Stability proofs can fortunately be readily obtained.)

With regard to accuracy, we will first investigate this by numerical experimenta- 
tion. In short, we are verifying the FDTD scheme. We have discussed this topic in 
Chapter 1; it is so important that further comments are in order at this point. Right 
at the start, we must stress that we can only meaningfully talk about accuracy of 
our numerical model with respect to the field problem which we posed — what 
we defined as the approximate field problem in Chapter 1. This problem is almost 
always a simplified version of the real-world problem. For instance, in our trans- 
mission line model, we assume no loss; no matter how good our FDTD solution, 
if the transmission line we are modelling has significant loss, our solution cannot 
be an accurate simulation of the real problem. Hence, to verify a numerical model, 
results are often compared to a known analytical solution of the same approximate 
field problem. 

In practice of course, we want to use EM simulators to model the real world, 
and for this, comparison with measured data is highly desirable or even essential in 
many cases. Good agreement between measured data and numerically computed 
results indicates that all the important physics of the problem has been captured 
in the field description of the problem; the numerical approximation of the field 
problem is accurate and reliable; and (a point all too often overlooked) that reliable 
measurements on properly calibrated equipment have been made. 

In Fig. 2.7, the effect of decreasing the size of time-step (or as plotted, by equivalently
increasing the number of time points per period) is investigated. In Fig. 2.8,
the effect of decreasing the size of spatial step is investigated (again, the plot shows
the equivalent effect of increasing the number of spatial points along the length
of the line).

Figure 2.7 Normalized RMS error with exact solution versus number of time points per
period (M) with $N_z$ = 11. Note that Yee’s algorithm is unstable in this case for M < 16.

Figure 2.8 Normalized RMS error with exact solution versus number of spatial points ($N_z$)
with M = 64. Note that Yee’s algorithm is unstable in this case for $N_z$ > 45.

² A problem is said to be properly posed if: (i) a unique solution exists; (ii) the solution depends continuously on
the initial and/or boundary conditions.

In both cases, one notes that stability imposes limits on the discretization. As
already hinted at, with an explicit method such as the FDTD, there will be some
maximum time-step size. As Fig. 2.7 implies, this will also be linked to spatial
step size. The oft quoted stability criterion for Yee’s algorithm in one, two, or
three dimensions is

$$\frac{u\,\Delta t}{\Delta s} \le \frac{1}{\sqrt{n}} \tag{2.65}$$

where Δs is the length of a side of a uniform cell and n is the number of space
dimensions in the problem. For our one-dimensional problem, the above becomes
r ≤ 1 or

$$\frac{u\,\Delta t}{\Delta z} \le 1 \tag{2.66}$$

where $u = 1/\sqrt{LC}$ is the velocity of propagation on the line.

The above stability criterion is also called the Courant condition, and it can be 
derived using Von Neumann’s method applied to Yee’s algorithm. The essential 
idea is to discretize a known plane wave in the algorithm, and require that its 
amplitude remain bounded as time-stepping progresses. 

A physical interpretation of the Courant condition may be obtained by consid- 
ering both the numerical domain of dependence and the physical domain of depen- 
dence for an arbitrary point in the grid. This is illustrated in Fig. 2.9. The region 
within the solid lines is the numerical domain of dependence and the region within 
the dashed lines is the physical domain of dependence. The solid lines have slopes 
of magnitude Δt/Δz and the dashed lines have slopes of magnitude 1/u. Yee’s
algorithm is stable provided that the physical domain of dependence is contained 
within the numerical domain of dependence. If this is not the case, then grid points 
outside of the numerical domain of dependence should be influencing the solution 
but cannot. Hence, instability is the result. The physical domain of dependence 
is contained within the numerical domain of dependence provided 1/u ≥ Δt/Δz,
which is the Courant condition. 
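
For the model problem, the Courant number uΔt/Δz and its square r from Eq. (2.64) follow directly from the grid definitions. A minimal sketch (the function name and defaults are assumptions for illustration, with T taken as the 0.25 s period of the excitation):

```python
import math


def courant_number(Nz, M, h=0.25, T=0.25, L=1.0, C=1.0):
    """Return u*dt/dz for Nz spatial points and M time points per period."""
    dz = h / (Nz - 1)           # Eq. (2.19)
    dt = T / (M - 1)            # Eq. (2.21)
    u = 1.0 / math.sqrt(L * C)  # velocity of propagation
    return u * dt / dz


if __name__ == "__main__":
    for Nz, M in ((11, 64), (11, 16), (11, 15), (45, 64), (46, 64)):
        s = courant_number(Nz, M)
        print(f"Nz = {Nz:2d}, M = {M:2d}: u*dt/dz = {s:.4f}, r = {s * s:.4f}")
```

All five cases satisfy uΔt/Δz ≤ 1, yet r crosses roughly 1/2 between M = 16 and M = 15 (at $N_z$ = 11) and between $N_z$ = 45 and $N_z$ = 46 (at M = 64), which is where the figure captions report the onset of instability for this example.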

The Courant condition guarantees the stability of the basic update equations derived
from the transmission line equations. However, it does not guarantee stability
of the overall algorithm. Additional stability criteria exist for the update equations
at the boundaries. Unfortunately, these are not usually known analytically, and numerical
experimentation is often required. In practice, many FDTD simulations
use either perfect electrical conductors (PECs) or absorbing boundary conditions
on the exterior boundaries; the former simply zero the tangential fields, the latter
aim to match the interior wave properties as far as possible. Hence, this is generally
not as serious a problem as this example might lead one to believe. Nonetheless, it
is a point worth bearing in mind.

Figure 2.9 Physical interpretation of the Courant limit.

We have already seen that for our particular example, we experience instability
for values of r greater than about 1/2 (r is the fraction of the Courant limit; r = 1
implies one is at the limit). Consider the transmission line circuit problem shown in
Fig. 2.10. Assume L = 1 H/m, C = 1 F/m, h = 0.25 m, and $R_L$ is allowed to vary.
Figure 2.11 shows the number of periods required for convergence of the solution
and normalized RMS error with the exact solution versus reflection coefficient at
the load.³ Computations were made with $N_z$ = 11, M = 64, and ε = 0.002. The
algorithm is found to be unstable for values of $R_L$ equal to or less than about
0.15 Ω, in spite of the very small value of r = 0.0252.

Further on the topic of stability, consider the normalized RMS error with the
exact solution versus the number of periods used in the calculation for $R_L$ = 200 Ω,
as shown in Fig. 2.12. Note that in the context of a 1 Ω system, this load is almost an
open circuit. So-called “late time instabilities” in Yee’s algorithm are rumored to
manifest themselves when dealing with high Q structures (such as this example)
that require a large number of time-steps for convergence to the steady state. These
instabilities are usually attributed to the accumulation of round-off errors.

Figure 2.10 Transmission line circuit problem illustrating effect of load on stability.

Figure 2.11 Number of periods required for convergence of the solution and normalized
RMS error with the exact solution versus reflection coefficient at the load.

³ For $R_L$ < 1, the value of $E_{rms}$ is normalized by dividing by the maximum value of voltage on the line.


2.5 Obtaining wideband data using the FDTD 


The transmission line example we have discussed follows the same historical path 
as the first FDTD work, by using a single-frequency excitation, waiting for the 
transients to die out, and then using the Fourier transform to give the frequency 
domain solution. It also connects elegantly with phasor circuit theory as taught 
worldwide at undergraduate level. However, this is a very inefficient use of the
FDTD. The frequency spectrum associated with an excitation can directly produce
the desired system response using some elementary concepts from system theory.
Given an input signal and its s (= jω) domain transform x(t) ↔ X(jω), a transfer
function h(t) ↔ H(jω) and output signal y(t) ↔ Y(jω), we can find the transfer
function as

$$H(j\omega) = \frac{Y(j\omega)}{X(j\omega)} \tag{2.67}$$

Figure 2.12 Illustration of late-time instabilities.

In introductory courses in circuit theory, one may have been asked to measure 
a transfer function in the laboratory, using a signal generator and an oscilloscope, 
with one channel monitoring the input and the other the output; in this case, H(jω)
has to be computed point by point across the required spectrum (a very painful 
process, not least since the signal generator needs to be continually re-set to a con- 
stant amplitude and phase as its frequency is changed, or these data must be noted 
for subsequent processing). What we have just done with our transmission line 
problem is the same, although done computationally. However, by using sources 
with more than just one frequency component, we can readily evaluate a number
of points simultaneously. Ideally, we would like a signal containing all possible
frequencies (the Dirac delta function, of course, with spectrum X(s) = 1); for reasons
we will appreciate shortly, this is neither practical nor desirable in real FDTD
code (although it is possible in the very special case of a 1D code running at the 
“magic time-step,” to be discussed subsequently). 
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Figure 2.13 A Gaussian pulse. 


Examples of wideband sources used in FDTD simulation include the follow- 
ing forms: Gaussian, Gaussian derivative, Rayleigh, chirp and wavelet pulses. The 
properties of the first two, perhaps the most popular in introductory FDTD work, 
are discussed in the following sections, as well as another interesting polynomial 
pulse. 


2.5.1 The Gaussian pulse 
The Gaussian pulse (Fig. 2.13) is popular in FDTD simulations: 


$$v_0(t) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(t-m)^2/2\sigma^2} \tag{2.68}$$

It has the advantage of having an analytically known spectrum — one of the pecu- 
liarities of the Fourier transform is that the spectrum of a Gaussian pulse is also a 
Gaussian (Fig. 2.14): 


$$V_0(\omega) = e^{-j\omega m}\, e^{-\sigma^2\omega^2/2} \tag{2.69}$$


The energy contained in the pulse is also readily obtained: 


$$E = \int_{-\infty}^{\infty} v_0^2(t)\,dt
    = \frac{1}{2\pi}\int_{-\infty}^{\infty} |V_0(\omega)|^2\,d\omega
    = \frac{1}{2\sigma\sqrt{\pi}} \tag{2.70}$$

Figure 2.14 Spectrum of a Gaussian pulse.


However, the Gaussian pulse has some significant disadvantages. The most im- 
portant are: 


• it exists for all time, including $t < 0$,
• it has a strong frequency component at $\omega = 0$, i.e. DC.


The former requires that the pulse be windowed at some time (i.e. set to zero),
which means there is a slight discontinuity at switch-on. The latter is a more subtle
point; it turns out the static (DC) component can cause problems with charge build-up
in FDTD grids⁴ and it is better to avoid strong DC spectral components in
FDTD simulations.
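
The closed-form results of Eqs. (2.68) to (2.70), and the two disadvantages just listed, can be checked numerically. The following is a minimal sketch in Python (the book's own scripts are in MATLAB); the values $\sigma = 1$, $m = 4\sigma$ and the sampling step are illustrative assumptions, not values taken from the text.

```python
import cmath
import math


def gaussian_pulse(t, m, sigma):
    """Time-domain Gaussian of Eq. (2.68), centred at t = m."""
    return math.exp(-((t - m) ** 2) / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)


def gaussian_spectrum(omega, m, sigma):
    """Analytic spectrum of Eq. (2.69)."""
    return cmath.exp(-1j * omega * m) * math.exp(-((sigma * omega) ** 2) / 2.0)


def trapezoid(samples, dt):
    """Trapezoidal rule on equally spaced samples."""
    return dt * (sum(samples) - 0.5 * (samples[0] + samples[-1]))


sigma = 1.0        # assumed standard deviation (s)
m = 4.0 * sigma    # assumed delay of four standard deviations
dt = sigma / 50.0  # assumed sampling step
n_steps = 400      # window 0 <= t <= 8*sigma

times = [k * dt for k in range(n_steps + 1)]
energy_numeric = trapezoid([gaussian_pulse(t, m, sigma) ** 2 for t in times], dt)
energy_exact = 1.0 / (2.0 * sigma * math.sqrt(math.pi))  # Eq. (2.70)

# The two disadvantages: a strong DC term and a small jump at switch-on (t = 0).
dc_component = abs(gaussian_spectrum(0.0, m, sigma))
switch_on_ratio = gaussian_pulse(0.0, m, sigma) / gaussian_pulse(m, m, sigma)

print(energy_numeric, energy_exact)
print(dc_component, switch_on_ratio)
```

With the pulse delayed by four standard deviations, the value at switch-on is $e^{-8}$ of the peak, which is the "slight discontinuity" referred to above.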


2.5.2 The Gaussian derivative pulse 


A simple variant on the Gaussian, namely its derivative (Fig. 2.15), is also very 
popular in FDTD simulations, since it removes the DC component. It is defined as 
follows: 


$$v_0(t) = -\frac{(t-m)}{\sqrt{2\pi}\,\sigma^3}\, e^{-(t-m)^2/2\sigma^2} \tag{2.71}$$


4 Showing this is beyond the scope of this introductory discussion; for a detailed analysis, refer to [5].

Figure 2.15 A Gaussian derivative pulse.


The spectrum of the Gaussian derivative pulse is (Fig. 2.16): 


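$$V_0(\omega) = j\omega\, e^{-j\omega m}\, e^{-\sigma^2\omega^2/2} \tag{2.72}$$


The energy of a Gaussian derivative pulse is also easily computed:


$$E = \int_{-\infty}^{\infty} v_0^2(t)\,dt
    = \frac{1}{2\pi}\int_{-\infty}^{\infty} |V_0(\omega)|^2\,d\omega
    = \frac{1}{4\sigma^3\sqrt{\pi}} \tag{2.73}$$


2.5.3 A polynomial pulse
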


A pulse with finite support and interesting properties is the following, of quartic
polynomial form:


$$f(t) = \begin{cases} (1 - t^2)^4 & \forall\, |t| \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{2.74}$$


This pulse does not appear to have a specific name. Its derivative has the important
property of also being zero at $|t| = 1$:


$$\frac{df(t)}{dt} = \begin{cases} -8t(1 - t^2)^3 & \forall\, |t| \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{2.75}$$

Figure 2.16 Spectrum of a Gaussian derivative pulse.


The first derivative has no DC content. 
Interestingly, its second derivative has the same zero property at |t| = 1: 


$$\frac{d^2 f(t)}{dt^2} = \begin{cases} 48t^2(1 - t^2)^2 - 8(1 - t^2)^3 & \forall\, |t| \le 1 \\ 0 & \text{otherwise} \end{cases} \tag{2.76}$$


Thus, the pulse has extremely smooth switch-on and switch-off characteristics, 
with the pulse and both its first and second derivatives all being zero at |t| = 1. 
These properties are clearly visible in Fig. 2.17. (Although not shown, this property 
even extends to the third derivative.) 

By replacing $t$ with


$$\tau(t) = 1 - 2(t/T) \tag{2.77}$$


in the above, a pulse is obtained with switch-on time $t = 0$ and duration $T$.
The Fourier transform of these pulses must be computed numerically.

Figure 2.17 The $(1 - t^2)^4$ pulse, and its first and second derivatives.


This pulse is especially suitable for use as a windowed sinusoid (continuous
wave):


$$f_{cw}(t) = [1 - \tau(t)^2]^4 \sin[m(2\pi t/T)] \tag{2.78}$$


with integer $m$ controlling the number of cycles in the pulse. Clearly, $m = 1$
corresponds to one cycle only, since the windowing function is non-zero only in the
interval $t = [0, T]$. An example of a ten cycle windowed sinusoid is shown in
Fig. 2.18.

This specific pulse, and its use as a window, appear to have been introduced
in [6], although windowed sinusoids have been quite widely used in FDTD
analysis.
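
The smoothness properties claimed for this pulse are easy to confirm numerically. The sketch below, in Python rather than the book's MATLAB, codes Eqs. (2.74) to (2.78) directly; the function names, the duration $T = 2$ and the finite-difference step are assumptions made for illustration only.

```python
import math


def poly_pulse(t):
    """The (1 - t^2)^4 pulse of Eq. (2.74), zero outside |t| <= 1."""
    return (1.0 - t * t) ** 4 if abs(t) <= 1.0 else 0.0


def poly_pulse_d1(t):
    """First derivative, Eq. (2.75)."""
    return -8.0 * t * (1.0 - t * t) ** 3 if abs(t) <= 1.0 else 0.0


def poly_pulse_d2(t):
    """Second derivative, Eq. (2.76)."""
    if abs(t) > 1.0:
        return 0.0
    return 48.0 * t * t * (1.0 - t * t) ** 2 - 8.0 * (1.0 - t * t) ** 3


def tau(t, duration):
    """Time mapping of Eq. (2.77): [0, T] is mapped onto [1, -1]."""
    return 1.0 - 2.0 * (t / duration)


def windowed_sinusoid(t, duration, cycles):
    """Windowed continuous wave of Eq. (2.78) with an integer number of cycles."""
    if t < 0.0 or t > duration:
        return 0.0
    return poly_pulse(tau(t, duration)) * math.sin(cycles * 2.0 * math.pi * t / duration)


# The pulse and its first two derivatives all vanish at the ends of the support.
print(poly_pulse(1.0), poly_pulse_d1(1.0), poly_pulse_d2(1.0))

# Central differences as an independent check on the analytic derivatives.
step = 1e-5
d1_numeric = (poly_pulse(0.3 + step) - poly_pulse(0.3 - step)) / (2.0 * step)
d2_numeric = (poly_pulse_d1(0.3 + step) - poly_pulse_d1(0.3 - step)) / (2.0 * step)
print(d1_numeric, poly_pulse_d1(0.3))
print(d2_numeric, poly_pulse_d2(0.3))

# Ten-cycle windowed sinusoid sampled over an assumed duration T = 2.
T = 2.0
samples = [windowed_sinusoid(k * T / 400.0, T, 10) for k in range(401)]
```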


2.5.4 The 1D transmission line revisited from a wideband perspective 


We will now revisit our model 1D transmission line problem, and pose a slightly 
different question. 

Find the frequency response $V_L(\omega)/V_0(\omega)$ of the transmission line circuit
shown in Fig. 2.19 for $0 < \omega < 16\pi$ rad/s. Assume $L = 1$ H/m, $C = 1$ F/m,
$h = 0.25$ m, $R_S = 0.5\ \Omega$, and $R_L = 2\ \Omega$.

Figure 2.19 One-dimensional transmission line.


Using standard frequency domain transmission line analysis, we can obtain an 
exact solution for this problem as 


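$$\frac{V_L(\omega)}{V_0(\omega)} = \frac{Z_0}{Z_0 + R_S}\,
  \frac{1 + \Gamma_L}{e^{j\beta h} - \Gamma_L \Gamma_S e^{-j\beta h}} \tag{2.79}$$

where

$$\Gamma_L = \frac{R_L - Z_0}{R_L + Z_0} \tag{2.80}$$

$$\Gamma_S = \frac{R_S - Z_0}{R_S + Z_0} \tag{2.81}$$

$$Z_0 = \sqrt{\frac{L}{C}} \tag{2.82}$$

$$\beta = \omega\sqrt{LC} \tag{2.83}$$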

To obtain the transfer function of the circuit using a Gaussian pulse source, we
need to do the following.


• Set the bandwidth of the source to be wide enough to cover the frequencies of interest
by choosing the standard deviation of the frequency spectrum of the source to be equal
to $\omega_{max}$, the maximum radian frequency of interest, i.e.

$$1/\sigma = \omega_{max} \tag{2.84}$$

In the results to be shown, $\omega_{max} = 16\pi$ was used. This is sufficient to
demonstrate the behavior of the transfer function over frequency, as well as the
dispersive nature of the FDTD algorithm (more on this subsequently).

• Set the mean value of the time domain source function to be equal to four standard
deviations so that the source can safely be assumed to be zero for $t < 0$, i.e.,

$$m = 4\sigma = 4/\omega_{max} \tag{2.85}$$

• Choose a space step such that

$$\Delta z \le \frac{\lambda_{min}}{10} \tag{2.86}$$

or equivalently,

$$\Delta z \le \frac{\pi}{5\,\omega_{max}\sqrt{LC}} \tag{2.87}$$

• Choose a time-step which satisfies both the stability criterion for Yee’s algorithm
(Courant condition) and the required Nyquist sampling rate for the highest frequency
in the pulse.

$$\Delta t \le \min\left(\Delta z\sqrt{LC},\ \frac{\pi}{4\,\omega_{max}}\right) \tag{2.88}$$

where we assumed that the highest frequency in the pulse is $4\omega_{max}$. (All
finite-time sources have a theoretically infinite spectrum; we have to decide some
reasonable upper limit on the spectrum. Recall that the Nyquist theorem states that a
signal with maximum frequency content $f_m$ must be sampled at at least twice this
frequency, i.e. $\Delta t = 1/(2f_m)$, or $\Delta t = \pi/\omega_m$ in terms of radian
frequency. In this case, we chose $\omega_m = 4\omega_{max}$. Be careful not to confuse
$\omega_{max}$, the maximum frequency of interest, with $\omega_m$, the maximum
frequency present in the simulation!) Remember that the Courant condition does not
guarantee the stability of the update equations at the boundaries, as the inequality in
Eq. (2.88) reminds us.

• Use the FDTD update equations to let the system evolve during the source “on” time,
which can be taken to be $0 < t < m + 4\sigma$. At the end of this time, compute the
total energy of the source as⁵

$$E_{source} = \int_{0}^{m+4\sigma} v_0^2(t)\,dt \tag{2.89}$$

Also, compute the Fourier transform of the response and the first estimate of its total
energy as

$$E_L = \int_{-\infty}^{\infty} |V_L(\omega)|^2\,d\omega \tag{2.90}$$

• Allow the time evolution of the system to proceed. Periodically interrupt the time
evolution to compute the Fourier transform of the response and a new estimate of the
total energy of the response. Stop the time evolution of the system when the difference
between the $K$th and $(K-1)$th estimates of the total energy of the response,
normalized to the total energy of the source, is less than or equal to some positive
error bound, i.e.,

$$\frac{\left|E_L^{(K)} - E_L^{(K-1)}\right|}{E_{source}} \le \epsilon, \qquad \epsilon > 0 \tag{2.91}$$

and the total energy of the response is greater than some small fraction of the total energy
of the source.
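
The parameter choices in the steps above can be collected in a small helper. The following Python sketch is illustrative only: the function names, the dictionary keys, and the `min_fraction` argument are assumptions, and the space step is taken from the ten-cells-per-wavelength rule of thumb. It returns the largest admissible steps; in practice $\Delta z$ would also be chosen so that the line length $h$ is a whole number of cells.

```python
import math


def wideband_parameters(omega_max, L=1.0, C=1.0, cells_per_wavelength=10.0,
                        spectrum_factor=4.0):
    """Source and grid parameters for a Gaussian pulse covering 0..omega_max."""
    sigma = 1.0 / omega_max                       # Eq. (2.84)
    m = 4.0 * sigma                               # Eq. (2.85)
    velocity = 1.0 / math.sqrt(L * C)
    lambda_min = 2.0 * math.pi * velocity / omega_max
    dz_max = lambda_min / cells_per_wavelength    # spatial sampling rule of thumb
    dt_courant = dz_max * math.sqrt(L * C)        # Courant condition in 1D
    dt_nyquist = math.pi / (spectrum_factor * omega_max)  # highest frequency 4*omega_max
    return {
        "sigma": sigma,
        "m": m,
        "dz_max": dz_max,
        "dt_max": min(dt_courant, dt_nyquist),
        "source_on_time": m + 4.0 * sigma,
    }


def converged(e_new, e_old, e_source, eps, min_fraction):
    """Stopping test: energy estimates have settled and the response is not negligible."""
    settled = abs(e_new - e_old) / e_source <= eps
    significant = e_new >= min_fraction * e_source
    return settled and significant


params = wideband_parameters(16.0 * math.pi)
for name, value in params.items():
    print(name, value)
```

For $\omega_{max} = 16\pi$ and $L = C = 1$ this gives $\Delta z \le 1/80$ m, and the Courant bound ($1/80$ s) is tighter than the sampling bound ($1/64$ s).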


2.5.5 Estimating the Fourier transform 


The Fourier transform $X(\omega)$ of a time domain signal $x(t)$, for angular
frequencies $\omega = 2\pi f$, is defined as

$$X(\omega) = \int_{-\infty}^{\infty} x(t)\, e^{-j\omega t}\,dt \tag{2.92}$$

and inverse transform

$$x(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} X(\omega)\, e^{j\omega t}\,d\omega \tag{2.93}$$

The pair are also often written as

$$X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j2\pi f t}\,dt \tag{2.94}$$

and inverse transform

$$x(t) = \int_{-\infty}^{\infty} X(f)\, e^{j2\pi f t}\,df \tag{2.95}$$


5 This can be done conveniently in MATLAB using the trapz function.

We will approximate the Fourier transform using the discrete Fourier transform
(DFT) defined by

$$X(k) = \sum_{n=1}^{N} x(n)\, e^{-j2\pi(k-1)(n-1)/N}, \qquad 1 \le k \le N \tag{2.96}$$

Signal processing experts sometimes view the two as entirely different trans- 
forms, and indeed, there are significant differences: the Fourier transform is de- 
fined for aperiodic signals, whereas the DFT automatically renders the signals pe- 
riodic (at the Nyquist frequency); the Fourier transform is continuous, the DFT is 
discrete. However, we can very usefully approximate the Fourier transform with 
the DFT if we bear this in mind, ensure that we satisfy the sampling theorem and 
note that the DFT as defined above is missing the correct normalization. By replacing
the infinite limits in Eq. (2.92) with 0 and $T = N\Delta t$, and then approximating
the integral as a finite sum with $\Delta t = T/N$, we see that the DFT approximates
the Fourier transform, but with a $\Delta t$ scale factor missing, and also with the
signal repeated with period $T$.

The DFT can be confusing when first used in this context, since the DC component
is not in the middle as one might expect, but is rather the first component
$k = 1$. Some definitions of the DFT include a $1/N$ scaling factor in the forward
transform; others include this in the inverse transform. The DFT implementation in
MATLAB (fft) uses the latter convention. The DFT yields $N$ discrete frequency
samples with a spacing $\Delta f = 1/T = 1/(N\Delta t)$. The number of samples $N$ is
usually taken to be a power of 2 (also sometimes known as radix-2) so that efficient
algorithms, specifically the FFT, can be used to compute the DFT.⁶ For an even
number of samples $N$, the actual frequencies are defined as:


$$f_k = (k - 1)\Delta f \tag{2.97}$$

for $k = 1, 2, \ldots, N/2$ and

$$f_k = (k - N - 1)\Delta f \tag{2.98}$$


for $k = N/2 + 1, \ldots, N$ (the negative frequencies). The frequency at $k = N/2 + 1$
(which can equally validly be viewed as a positive frequency), of magnitude
$(N/2)\Delta f$, is also known as the folding frequency or the Nyquist frequency, and
the Fourier transform is symmetric about this.
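
The index-to-frequency mapping of Eqs. (2.97) and (2.98) can be sketched as follows. The code is Python, with the function name an assumption; it keeps the 1-based index $k$ of the text internally and returns a list in which entry `k - 1` holds $f_k$.

```python
import math


def dft_frequencies(n_samples, dt):
    """Frequencies f_k (Hz) of the DFT bins k = 1..N, for an even number of samples N."""
    if n_samples % 2 != 0:
        raise ValueError("this mapping assumes an even number of samples")
    df = 1.0 / (n_samples * dt)
    freqs = []
    for k in range(1, n_samples + 1):
        if k <= n_samples // 2:
            freqs.append((k - 1) * df)              # Eq. (2.97): DC and positive bins
        else:
            freqs.append((k - n_samples - 1) * df)  # Eq. (2.98): negative bins
    return freqs


# Eight samples over one period of a 1 rad/s signal, so that df = 1/(2*pi) Hz.
freqs = dft_frequencies(8, 2.0 * math.pi / 8.0)
print(freqs)
```

Note that bin $k = N/2 + 1$ (list entry `N // 2`) carries the folding frequency, reported here with a negative sign in line with Eq. (2.98).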


6 In Chapter 6, the fast Fourier transform (FFT) is discussed in some detail.


An aside: gaining confidence with the DFT (and FFT)


Despite undergraduate exposure, the DFT can remain rather mysterious to many
students. One way to gain confidence with the DFT is to Fourier transform
simple signals, whose transforms are known. Consider a cosine signal of angular
frequency 1 rad/s. Its period is $2\pi$ s, and its frequency $1/(2\pi)$ Hz. (From
elementary courses on signal theory, it will be recalled that its Fourier transform
is $\pi\{\delta(\omega + 1) + \delta(\omega - 1)\}$.) Let us take eight samples over
one period (remember that we must take more than two to satisfy the sampling
theorem!). These should be equally spaced from $t = 0$ to $t = (7/8)2\pi$. (Including
the sample at $t = 2\pi$ would be incorrect, since this point has already been
included at $t = 0$.) This can be achieved very simply in MATLAB by using the command:
t = linspace(0, (7/8)*2*pi, 8)
Now we create the cosine signal:
x = cos(t)
and apply the DFT (implemented as the FFT) to this:
X = fft(x)
The result is the following vector:
X = [0 4 0 0 0 0 0 4]
Inserting the $\Delta t = T/N$ scale factor, which MATLAB omits, with $T = 2\pi$ and
$N = 8$ in this case, this vector is
X = pi*[0 1 0 0 0 0 0 1]
and we immediately recognize the positive frequency component $X(k = 2)$ at
$f_2 = 1/(2\pi)$, and negative frequency component $X(k = 8)$ at $f_8 = -1/(2\pi)$.
(Note that the FFT is complex, but by choosing a signal with even time symmetry,
only the real parts of the Fourier transform are non-zero.)

Since most of our applications of the Fourier transform will be in computing
ratios of spectra, the constants are not of great concern, but should be included
for completeness.
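
The same experiment can be repeated without MATLAB. The sketch below is a direct (slow) DFT in plain Python, written from the definition in Eq. (2.96) but with zero-based indices, so that list entry 1 corresponds to $k = 2$ in the text and entry 7 to $k = 8$; the function name `dft` is an assumption.

```python
import cmath
import math


def dft(x):
    """Direct evaluation of the DFT sum (no 1/N factor in the forward transform)."""
    n_samples = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_samples) for n in range(n_samples))
        for k in range(n_samples)
    ]


N = 8
T = 2.0 * math.pi                              # one period of cos(t)
t = [n * T / N for n in range(N)]              # t = 0 ... (7/8)*2*pi, endpoint excluded
x = [math.cos(tn) for tn in t]

X = dft(x)                                     # two bins of value 4, the rest zero
X_scaled = [(T / N) * value for value in X]    # insert the missing dt = T/N factor

print([round(abs(value), 12) for value in X])
print([round(abs(value), 12) for value in X_scaled])
```

After scaling, the two non-zero bins equal $\pi$, matching the weights of the delta functions in the continuous transform.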


The highest (non-aliased) frequency in the spectrum produced by the FFT is
$f_{max} = 1/(2\Delta t)$ and the frequency points are spaced by $\Delta f = 1/(N\Delta t)$.
Additional frequency points (i.e., smaller values of $\Delta f$) can be obtained by
zero-padding of the time domain data. The spectrum obtained by zero-padding of the time
domain data is equivalent to that obtained by sinc-interpolation of the frequency domain
data. (As an aside, we note that zero-padding to improve frequency resolution
is a questionable practice, since no additional real data have been added to the
system.)

To compare the FDTD solution to the exact solution, define the normalized RMS
error with respect to the exact solution as


$$E_{rms} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}
  \left|\frac{V_L^{(FDTD)}(\omega_i)}{V_0^{(FDTD)}(\omega_i)}
      - \frac{V_L^{(exact)}(\omega_i)}{V_0^{(exact)}(\omega_i)}\right|^2} \tag{2.99}$$

Figure 2.20 Generator and load voltages in the time domain for a Gaussian pulse.


2.5.6 Simulation using Gaussian and Gaussian derivative pulses 


The FDTD solution with Gaussian pulse excitation required 694 total time-steps
for convergence and resulted in $E_{rms} = 0.189$. Results are shown in Figs. 2.20, 2.21
and 2.22. The “ringing” in Fig. 2.20 is characteristic in telecommunications theory
of a wideband signal on a dispersive channel, and we will see shortly that the
FDTD indeed has dispersive properties.

The FDTD solution with Gaussian derivative pulse excitation required 820 total
time-steps for convergence and resulted in $E_{rms} = 0.190$. The generator and load
voltages in the time domain for a Gaussian derivative pulse are shown in Fig. 2.23.


2.6 Numerical dispersion in FDTD simulations 
2.6.1 Dispersion 


Dispersion is the phenomenon of signal distortion caused by the dependence of
phase velocity ($v_p$) on frequency. In a dispersive medium, either $\epsilon$ or $\mu$
or both are frequency dependent. The resulting dispersion is called natural dispersion.
In general, normal dispersion occurs when $dv_p/d\omega < 0$ and anomalous dispersion
occurs when $dv_p/d\omega > 0$. Numerical solutions (such as the FDTD) can also
introduce numerical dispersion; we will return to this shortly.

Figure 2.21 Magnitude of the transfer functions for exact and FDTD solutions.
Figure 2.22 Phase of the transfer functions for exact and FDTD solutions.
Figure 2.23 Generator and load voltages for a Gaussian derivative pulse.


The phase velocity in a medium is given by:


$$v_p = \frac{\omega}{\beta} \tag{2.100}$$


A dispersion relation describes the relationship between $\beta$ and $\omega$. In a
distortionless medium, $\beta$ is a linear function of $\omega$ and hence the phase
velocity is constant with frequency. The signal (group) velocity is the velocity with
which the signal (i.e., information) moves. It is the signal velocity which can never
exceed the speed of light in a vacuum. The signal velocity can be computed as:


$$v_g = \frac{h}{T_d} \tag{2.101}$$


where $T_d$ is the delay time experienced by the signal in traveling over the distance
$h$.

Assume that the signal at $z = 0$ is given by $v_0(t)$, and that the signal at $z = h$
is given by $v_L(t)$. The delay time in traveling from $z = 0$ to $z = h$ ($T_d$) is the
value of $\tau$ which maximizes the cross-correlation between $v_L(t)$ and $v_0(t)$,


$$x_L(\tau) = \int_{-\infty}^{\infty} v_L(t)\, v_0(t - \tau)\,dt \tag{2.102}$$


In a distortionless medium, the group velocity is equal to the phase velocity.

Figure 2.24 Theoretical generator and load voltages in the time domain for a Gaussian
pulse.
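
Estimating the delay from sampled waveforms amounts to evaluating the cross-correlation of Eq. (2.102) as a sum and picking the lag with the largest value. The Python sketch below illustrates this under stated assumptions: the waveforms are synthetic (an ideal Gaussian pulse and a copy delayed by 0.5 s, standing in for recorded FDTD voltages), and the function names, the pulse width and the sampling step are chosen here for illustration.

```python
import math


def gaussian(t, centre, sigma):
    """Unnormalised Gaussian used as a stand-in for a recorded voltage."""
    return math.exp(-((t - centre) ** 2) / (2.0 * sigma ** 2))


def delay_by_cross_correlation(v0, vl, dt):
    """Return the non-negative lag (in seconds) that maximises sum_t vl(t) * v0(t - lag)."""
    n_samples = len(v0)
    best_lag, best_value = 0, float("-inf")
    for lag in range(n_samples):
        value = sum(vl[i] * v0[i - lag] for i in range(lag, n_samples))
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag * dt


dt = 0.01   # assumed sampling step (s)
h = 0.5     # line length (m)
times = [k * dt for k in range(300)]
v0 = [gaussian(t, 0.5, 0.1) for t in times]          # signal at z = 0
vl = [gaussian(t - 0.5, 0.5, 0.1) for t in times]    # v_L(t) = v_0(t - 0.5) at z = h

delay = delay_by_cross_correlation(v0, vl, dt)
group_velocity = h / delay                            # Eq. (2.101)
print(delay, group_velocity)
```

The resolution of the estimate is one sample; a finer estimate would need interpolation around the correlation peak, which is not attempted here.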


A numerical algorithm can introduce numerical dispersion, even when waves
are propagating in a distortionless medium. Yee’s FDTD algorithm causes numerical
dispersion. We will illustrate this by comparing theoretical and FDTD results
for our simple transmission line circuit shown in Fig. 2.19. Assume $L = 1$ H/m,
$C = 1$ F/m, $h = 0.5$ m, $R_S = 1\ \Omega$, and $R_L = 1\ \Omega$. This transmission
line is distortionless with $\beta = \omega\sqrt{LC}$. The phase velocity on the line
($v_p$) is a constant versus frequency and is equal to 1 m/s. The source and load
impedances are equal to the characteristic impedance of the line. Hence, there are no
reflections at either end of the line. The theoretical generator and load voltages in
the time domain for Gaussian pulse excitation are shown in Fig. 2.24; the load voltage
is simply the generator voltage delayed by $h/v_p = 0.5$ s:


$$v_L(t) = v_0(t - 0.5) \tag{2.103}$$


Compare these with the results computed in the time domain using the FDTD
shown in Fig. 2.25. The ringing clearly visible on the load voltage is the result of
numerical dispersion.


2.6.2 Derivation of the dispersion equation 


Figure 2.25 FDTD generator and load voltages in the time domain for a Gaussian pulse.

To obtain the numerical dispersion relation resulting from Yee’s algorithm, we
assume monochromatic plane-wave trial solutions. Substituting these trial solutions
into the update equations and performing some straightforward algebraic manipulations
yields the numerical dispersion relation. The procedure is as follows.

Firstly, assume trial solutions of plane-wave form. In continuous space-time, a
$z$-propagating plane wave has the form $e^{j\omega t}e^{-j\beta z}$. In discretized
form, and allowing for arbitrary amplitude, this becomes:


$$V_k^n = A\, e^{j\omega n\Delta t}\, e^{-j\beta k\Delta z} \tag{2.104}$$

A similar equation can be written for the discretized current:

$$I_k^{n+1/2} = B\, e^{j\omega(n+1/2)\Delta t}\, e^{-j\beta(k+1/2)\Delta z} \tag{2.105}$$

noting the offset between voltage and current.


These are now substituted into the update equations (2.29) and (2.32) to obtain
the expression for the next time-step:


$$V_k^{n+1} = A\, e^{j\omega n\Delta t}\, e^{-j\beta k\Delta z}
  - \frac{\Delta t}{C\Delta z}\, B \left( e^{j\omega(n+1/2)\Delta t}\, e^{-j\beta(k+1/2)\Delta z}
  - e^{j\omega(n+1/2)\Delta t}\, e^{-j\beta(k-1/2)\Delta z} \right) \tag{2.106}$$


Obviously, the last exponential term can be simplified as a sinusoid.

The crucial step in the derivation is to recognize that the discretized plane wave
can also be written as:


$$V_k^{n+1} = A\, e^{j\omega(n+1)\Delta t}\, e^{-j\beta k\Delta z} \tag{2.107}$$


Since these two equations represent the same wave (albeit via the FDTD update
and the analytical solution respectively), we can equate them. Thus, equating
Eqs. (2.106) and (2.107), noting that for a plane wave the ratio of voltage to current
is $Z_0 = \sqrt{L/C}$, and simplifying Eq. (2.106), we obtain the dispersion equation:


$$\sin\left(\frac{\omega\Delta t}{2}\right)
  - \frac{\Delta t}{\sqrt{LC}\,\Delta z}\,\sin\left(\frac{\beta\Delta z}{2}\right) = 0 \tag{2.108}$$


In the limit as $\Delta z \to 0$ (and thus, from the Courant limit, $\Delta t \to 0$),
the small argument approximation (Taylor series expansion) of the sine function can be
applied, and the expression becomes the exact (dispersionless) relation for the
transmission line. This is important, because it indicates that dispersion in an FDTD
mesh can be controlled by making the mesh sufficiently fine. This is a general result,
and applies in 2D and 3D (although the dispersion equation is more complex,
of course).


2.6.3 Some closing comments on dispersion in FDTD grids 


Given $\omega$, $L$, $C$, $\Delta t$, and $\Delta z$, the above non-linear equation can
be solved numerically for $\beta$, allowing us to determine the phase velocity as a
function of frequency. This is shown graphically in Fig. 2.26.

The exact group velocity is $v_g = 1$ m/s, and the group velocity resulting from
using Yee’s algorithm varies over the range of frequencies simulated in our model
problem from this value to around 0.984 m/s, a difference exceeding 10%.
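
For the 1D relation of Eq. (2.108), solving for $\beta$ does not need an iterative root finder, since the equation can be rearranged into an arcsine. The Python sketch below does this; the function names are assumptions, and the grid values ($\Delta z = 1/80$ m with a time-step of half the Courant limit) are illustrative choices rather than the settings used for Fig. 2.26.

```python
import math


def numerical_beta(omega, dt, dz, L=1.0, C=1.0):
    """Solve Eq. (2.108) for beta: sin(beta*dz/2) = (dz*sqrt(LC)/dt) * sin(omega*dt/2)."""
    s = (dz * math.sqrt(L * C) / dt) * math.sin(omega * dt / 2.0)
    if abs(s) > 1.0:
        raise ValueError("no real beta: this frequency does not propagate on the grid")
    return (2.0 / dz) * math.asin(s)


def numerical_phase_velocity(omega, dt, dz, L=1.0, C=1.0):
    """Phase velocity omega/beta of the discretized line, Eq. (2.100)."""
    return omega / numerical_beta(omega, dt, dz, L, C)


dz = 1.0 / 80.0   # assumed space step (m)
dt = dz / 2.0     # assumed time-step: half the Courant limit for L = C = 1

for multiple in (2, 4, 8, 16):
    omega = multiple * math.pi
    print(multiple, numerical_phase_velocity(omega, dt, dz))
```

On this assumed grid the numerical phase velocity falls from very nearly 1 m/s at low frequency to roughly 0.987 m/s at $\omega = 16\pi$ rad/s, and frequencies for which the arcsine argument exceeds unity do not propagate at all.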

Figure 2.26 Phase velocity as a function of frequency.

As a closing comment on the subject of dispersion, it is interesting to note, from
Eq. (2.108), that if the FDTD simulation is run at the Courant limit, viz.
$\Delta t = \Delta z/c$, with $c = 1/\sqrt{LC}$, the term in front of the second sinusoid
becomes unity, hence the sinusoids are equal and hence their arguments, thus
$\omega/\beta = \Delta z/\Delta t = v_p$; in other words, there is no dispersion. This
is also sometimes known as the “magic” time-step. This implies that an FDTD simulation
run at this time-step can (in theory at least) handle Dirac delta functions (of
infinitely wide bandwidth). Unfortunately, this does not extend to two or three
dimensions, and is thus just a curiosity of no practical value. In 2D and 3D, it turns
out that dispersion is minimized (but not eliminated) by operating at the Courant
limit. FDTD beginners often run their codes well below the Courant limit, believing
that their results will be better with a smaller time-step, but due to numerical
dispersion, this is not the case.

We can summarize this rather counter-intuitive fact as follows: FDTD codes
should be run as close to the Courant limit as possible. It should also be noted that
numerical dispersion is frequency dependent, and worsens rapidly above a certain
frequency. As such, when using a wideband source, we should be careful to ensure
that we use a source whose spectrum does not have significant frequency content in
this region. This is where rules-of-thumb such as the 10 cells per wavelength criterion
used earlier in this chapter arise; we appreciate here that the concept of “wavelength”
is rather nebulous in the case of a wideband simulation, and we should
rather interpret this as the wavelength corresponding to the maximum frequency
of interest, often chosen as the point where the spectrum of the source is $1/e$ of its
maximum value (this is −8.6859 dB; −10 dB is also sometimes used). It must be
appreciated that these are guidelines rather than exact rules. It should also be
appreciated that these rules arose in an era when structures being simulated were at
most a wavelength or two in size; for larger structures, as can now be undertaken,
a finer discretization is required since dispersion accumulates over the length of
the simulation.
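
Both rules of thumb can be read off Eq. (2.108) once it is written in terms of two dimensionless numbers: the cells per wavelength $N_\lambda = \lambda/\Delta z$ and the ratio $S = \Delta t/(\Delta z\sqrt{LC})$ of the time-step to the Courant limit. With these, $\omega\Delta t/2 = \pi S/N_\lambda$. The Python sketch below tabulates the resulting phase velocity error; the symbol names and the sampled values are assumptions made for illustration.

```python
import math


def phase_velocity_ratio(cells_per_wavelength, courant_number):
    """Numerical over exact phase velocity on a 1D Yee grid, from Eq. (2.108)."""
    half_omega_dt = math.pi * courant_number / cells_per_wavelength
    s = math.sin(half_omega_dt) / courant_number
    if s > 1.0:
        raise ValueError("wave does not propagate on this grid")
    half_beta_dz = math.asin(s)
    return (math.pi / cells_per_wavelength) / half_beta_dz


for courant_number in (1.0, 0.9, 0.5):
    row = [phase_velocity_ratio(n, courant_number) for n in (5, 10, 20)]
    print(courant_number, row)
```

At $S = 1$ the ratio is unity for every resolution (the "magic" time-step); lowering $S$ at fixed resolution moves the ratio further from unity, while refining the mesh at fixed $S$ brings it back towards unity.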


2.7 Conclusion 


In this chapter, we have used a very simple one-dimensional transmission line
example to introduce the FDTD algorithm. We have seen from first principles how
to derive the update equations; this has also given us a handle on the accuracy
of the method. Due to its second-order nature, the Yee algorithm is surprisingly
accurate. The important issue of stability has been discussed, and we have seen
that the Courant stability criterion is a necessary, but not sufficient, condition for
stability: the boundary conditions can also cause instabilities, although as we
have commented, in most FDTD simulations, this is not usually a major cause of
concern.

Although the FDTD method can be used in the frequency domain, by simply 
waiting for the transients to die out — and indeed, our first example did just that — 
this is an inefficient use of the method, which is capable of generating wideband 
data in one run. This has been discussed in depth in this chapter. 

Finally, the fact that the FDTD method has numerical dispersion has been dis- 
cussed, as well as the implications. Importantly, and perhaps counter-intuitively, 
FDTD codes should be run as close to the stability limit as possible to minimize 
dispersion. 

With some very simple substitutions, one can solve one-dimensional TEM field
problems using the same theory that we have introduced. However, we prefer now
to move into two dimensions, and immediately address field problems there. This
is the topic of the next chapter.

The finite difference time domain method in two
and three dimensions


3.1 Introduction 


In the previous chapter, the basic concepts of the finite difference time domain
method were introduced via a one-dimensional example. We will briefly reprise
the issues one must attend to when doing an FDTD simulation, as follows.


• An FDTD mesh (or grid) must be created for the problem. (This is trivial in 1D, requires
a little thought in 2D, and becomes quite a major problem in 3D.)

• This mesh must be fine enough, i.e. $\Delta s$ must be no more than perhaps one-tenth
of the minimum wavelength (i.e. maximum frequency) of interest ($\Delta s$ represents
the spatial step size; quite often, $\Delta x$, $\Delta y$ and $\Delta z$ are chosen
equal and $\Delta s$ is used as shorthand for this).

• The time step $\Delta t$ must satisfy the Courant limit (but be as close to this as
possible to minimize dispersion).

• Boundary conditions (the source and load resistors in our 1D example) must be
specified.

• An appropriate signal shape (e.g. differentiated Gaussian) with suitable time duration
for the desired spectral content must be chosen. Also, in general, its spatial position
must be specified. (In the transmission line example, it was fixed as the source voltage
generator.)


In this chapter, we will study the FDTD method in two and three dimensions.
Firstly, we will develop a 2D simulator for a problem of scattering in free
space. Following this, a very important development, the perfectly matched layer
absorbing boundary condition, will be discussed and implemented. This is followed
by a brief discussion of the extension to three dimensions. We conclude
the chapter with a discussion of the use of CST MICROWAVE STUDIO™,
a commercial electromagnetics simulation package which includes an FDTD
solver.


3.2 The 2D FDTD algorithm


We will now apply these ideas to a free-space scattering problem in two dimensions.
Firstly, we remind the reader that although the real world is obviously three
dimensional, many useful problems can be solved when one of the dimensions is
much longer than the other two. In this case, we generally assume that the field
solution does not vary in this dimension (often arbitrarily chosen to be the
$z$-direction), which allows us to simplify the analysis greatly. (A note: assuming that
there is no variation in $z$, for instance, does not preclude $z$-directed fields; this
point can sometimes cause confusion.) In electromagnetics, this assumption permits
us to decouple the Maxwell equations into two sets of fields or modes, as
they are often called: transverse magnetic and transverse electric.¹ Any field subject
to the assumption of no variation in $z$ can be written as the sum of these
modes:


Transverse magnetic TM, often written TM$_z$, modes contain the following field
components: $E_z(x, y, t)$, $H_x(x, y, t)$ and $H_y(x, y, t)$.


Transverse electric TE, often written TE$_z$, modes contain the following field
components: $H_z(x, y, t)$, $E_x(x, y, t)$ and $E_y(x, y, t)$.


At the risk of repetition, there is no z variation in any of the above fields. 


3.2.1 Electromagnetic scattering problems 


When an electromagnetic field encounters a target,² currents are excited on it,
which in turn re-radiate. This process is called “electromagnetic scattering.” Obvious
applications are in radar, and also in multi-path analysis for radio-wave propagation.
Since the Maxwell equations are linear, the fields are often decomposed
into an incident field $E^{inc}$ and a scattered field $E^{scat}$. The overall field,
called the total field $E^{tot}$, is then:


$$E^{tot} = E^{inc} + E^{scat} \tag{3.1}$$


By definition, the incident field is the field which would exist if the scatterer were 
absent. This is very useful; often, this will be a plane wave which can easily be 
expressed mathematically in closed form. We will see shortly how useful this idea 
can be when studying scattering. 


1 Readers who have previously studied waveguide analysis will immediately recognize these concepts.
2 Because most of the original work was done for radar applications, the military term “target” is frequently used
for describing the scatterer in such circumstances.

3.2.2 The TE$_z$ formulation


At this stage, we could solve either (or both) transverse modes; the FDTD process
is essentially identical. We will choose the TE$_z$ formulation, because TE$_z$ waves
exhibit interesting behavior when scattering off circular targets: creeping waves
are excited on the structure, i.e. a wave “attaches” itself to the cylinder, goes around
the target and then comes back towards the source, potentially in or out of phase
with the incident field. (TM$_z$ waves do not do this; the reason is that the boundary
conditions are different.)

The TE$_z$ mode set is described by the following parts of Maxwell’s equations:

$$\frac{\partial E_x}{\partial t} = \frac{1}{\epsilon}\left(\frac{\partial H_z}{\partial y} - \sigma E_x\right) \tag{3.2}$$

$$\frac{\partial E_y}{\partial t} = \frac{1}{\epsilon}\left(-\frac{\partial H_z}{\partial x} - \sigma E_y\right) \tag{3.3}$$

$$\frac{\partial H_z}{\partial t} = \frac{1}{\mu}\left(\frac{\partial E_x}{\partial y} - \frac{\partial E_y}{\partial x}\right) \tag{3.4}$$


We will simplify these further by assuming that the materials are lossless:


$$\frac{\partial E_x}{\partial t} = \frac{1}{\epsilon}\frac{\partial H_z}{\partial y} \tag{3.5}$$

$$\frac{\partial E_y}{\partial t} = -\frac{1}{\epsilon}\frac{\partial H_z}{\partial x} \tag{3.6}$$

$$\frac{\partial H_z}{\partial t} = \frac{1}{\mu}\left(\frac{\partial E_x}{\partial y} - \frac{\partial E_y}{\partial x}\right) \tag{3.7}$$


In the transmission line case of the previous chapter it will be recalled that we
chose “half-step” increments for the current. We will apply the same idea to
developing a 2D FDTD solution of the above equations. We will make the following
choices:


$$x_i = (i - 1)\Delta x, \quad i = 1, 2, \ldots, N_x; \qquad
  \Delta x = \frac{X}{N_x - 1}, \quad N_x \ge 2 \tag{3.8}$$

$$y_j = (j - 1)\Delta y, \quad j = 1, 2, \ldots, N_y; \qquad
  \Delta y = \frac{Y}{N_y - 1}, \quad N_y \ge 2 \tag{3.9}$$

$$t_n = (n - 1)\Delta t, \quad n = 1, 2, 3, \ldots, M; \qquad
  \Delta t = \frac{T}{M - 1}, \quad M \ge 2 \tag{3.10}$$

Here, $X$ and $Y$ are the dimensions of the region we will be gridding (in the $x$ and
$y$ directions) and $N_x$ and $N_y$ obviously are the number of cells in each dimension.
It is traditional, but certainly not essential, to associate the indices $i$, $j$ and $k$
in an FDTD code with $x$, $y$ and $z$, and $m$ or $n$ with $t$.


Coding hints: the indices i, j and MATLAB


At this point, it is worth sounding a warning that using these traditional indices
can cause very frustrating problems in MATLAB, where i and j are usually
defined as $\sqrt{-1}$. A useful programming habit to develop is instead to use ii and
jj as indices.


A similar array of half-index points will also be defined:


$$x_{i+1/2} = (i - 1/2)\Delta x, \quad i = 1, 2, \ldots, N_x - 1 \tag{3.11}$$

$$y_{j+1/2} = (j - 1/2)\Delta y, \quad j = 1, 2, \ldots, N_y - 1 \tag{3.12}$$

$$t_{n+1/2} = (n - 1/2)\Delta t, \quad n = 1, 2, 3, \ldots \tag{3.13}$$


Following Yee’s choice, we will locate $H_z(i, j, n)$ at $x_{i+1/2}$, $y_{j+1/2}$,
$t_{n+1/2}$. $E_x(i, j, n)$ will be located at $x_{i+1/2}$, $y_j$, $t_n$ and
$E_y(i, j, n)$ at $x_i$, $y_{j+1/2}$, $t_n$. This choice is far from random; it provides
a spatial grid with the magnetic field $H_z$ surrounded in space by the electric fields
($E_x(i, j, n)$ and $E_x(i, j+1, n)$, $E_y(i, j, n)$ and $E_y(i+1, j, n)$) and offset in
time by $\Delta t/2$. The spatial locations are indicated in Fig. 3.1.

Now we will turn our attention to the discretization of the FDTD TE, modes. 
Consider Eq. (3.7). We apply central differencing to both time and space, produc- 
ing: 


Hit 5,j+5,n+4)-Hi+5,7+4,n—- 4) 


At 
1 | ExG@+5,j7+1,n)- Ex + 4, in) 
LL Ay 


(3.14) 


1| Ey@+1,j+5,n)- Ey, j+4,n) 
Ax 
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Figure 3.1 The Yee grid for the i, jth cell for the FDTD 2D TE, mode. 


=> 


Now, keeping only H,(i + 5. jt 5 n+ 5) on the left-hand side of the equa- 
tion, we rewrite this as: 


Se de aes st a 1 are aren 1 
Ay. tao a SS ie yt 


2 2 2 2 2 2 
+ len (i4 si thn) Ee (i+5.i0)| 
pwAy 2 2 
- [5 (i+ustin)-8 (wi+5.")| (3.15) 
wAx 2 2 


Similar procedures, applied to Eqs. (3.5) and (3.6), produce the update equations 
for the E-field components: 


"ok ee oe At oe. ! 
Ex{it>5.jnt1 =Ex\it5. jn Papo gt oe es 


ee as ae (3.16) 
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1 1 1 
—-Hz{i-xz,j+-, = 3.17 
z (: 5) Lr 5) n+ 5) ( ) 
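Just as in the 1D case, the half space and time increments are inconvenient to program, and we will refer simply to $i, j, n$ for $E_x$, $E_y$ and $H_z$, while keeping in mind the actual locations. We will also assume $\Delta x = \Delta y = \Delta s$. This allows us to simplify the above to the following:

$$H_z(i,j,n) = H_z(i,j,n-1) + \frac{\Delta t}{\mu\Delta s}\left[E_x(i,j+1,n) - E_x(i,j,n) + E_y(i,j,n) - E_y(i+1,j,n)\right] \tag{3.18}$$

$$E_x(i,j,n+1) = E_x(i,j,n) + \frac{\Delta t}{\epsilon\Delta s}\left[H_z(i,j,n) - H_z(i,j-1,n)\right] \tag{3.19}$$

$$E_y(i,j,n+1) = E_y(i,j,n) - \frac{\Delta t}{\epsilon\Delta s}\left[H_z(i,j,n) - H_z(i-1,j,n)\right] \tag{3.20}$$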


Note that when the electric fields are updated, the magnetic field values used are 
the newly updated ones. 
We now have our update equations, and the Courant limit for two dimensions:

$$\Delta t \le \frac{\Delta s}{c\sqrt{2}} \tag{3.21}$$


where c is the (largest) speed of light in the FDTD region (in non-vacuum regions, 
the speed of light is of course slowed). We are not quite ready to program, however. 
There are two things we still need to consider: injecting a source, and terminating 
the mesh. 
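As a quick numerical illustration of the two-dimensional Courant limit $\Delta t \le \Delta s/(c\sqrt{2})$, the sketch below evaluates the largest stable time step for an example cell size. The variable `courant` is a hypothetical safety factor introduced here for illustration only.

```matlab
c = 299792458;                            % speed of light in vacuum (m/s), exact SI value
delta_s = 0.01;                           % cell size (m); an example value
delta_t_max = delta_s / (c * sqrt(2));    % 2D Courant limit for delta_x = delta_y = delta_s
courant = 1.0;                            % hypothetical safety factor, must be <= 1
delta_t = courant * delta_t_max;          % time step actually used
fprintf('Maximum stable time step: %.3f ps\n', delta_t_max * 1e12);
```

In a region containing several materials, `c` would be the largest wave speed present, as noted above.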


3.2.3 Including a source: the scattered/total field formulation 


If we want to study scattering, we need a method for simulating a plane wave. (Usually, scattering problems assume that whatever source set up the incident field is far removed from the scatterer, and hence the field incident on the target is a uniform plane wave.) The simplest method for doing this is to exploit the concepts of incident, scattered and total fields introduced in Section 3.2.1. Since the Maxwell equations are linear, and we will only work with linear materials here, we can use the FDTD to solve for either the scattered or total fields. (Remember that the incident field is assumed known in this type of formulation.) We will split the computational area into two zones using a (non-physical) line, as in Fig. 3.2: in one region, we will have only scattered fields, and in the other, total fields. For convenience, we will choose a constant $x$ coordinate; we will assume this corresponds to index $i_L = L$ (the subscript $L$ is short for left; we could also position another scattered field zone to the right of the scatterer, etc.).

Figure 3.2 The scatterer and surrounding FDTD zones, showing scattered field and total field regions.

Now we note one of the points which can sometimes cause problems with the FDTD algorithm. Do we interpret $i_L$ as being on a spatial step or half-step? There is no single correct answer to this; we need to make a decision and then work consistently with it. Since three of the five field components in the two-dimensional Yee cell are located at half-step values of $x$, let us choose this. Hence our scattered/total field demarcation is located at $x_L = (L - \tfrac12)\Delta x$. Fields located on and to the right of this line will be chosen as total fields. Fields to the left will be scattered fields.

Clearly, we cannot simultaneously work with scattered and total fields in the update equations. However, because we know the incident field, we can add or subtract this as necessary. Let us consider the update equation for $H_z$. Here, we will retain the full notation (including half-steps) to avoid confusion. For $i < i_L$, we use Eq. (3.15), with all the fields scattered fields. For $i > i_L$, we use the same Eq. (3.15), but now all the fields are total fields.

On the zoning interface, $i = i_L$, we have a total $H_z^{tot}(i+\tfrac12, j+\tfrac12)$ field, total $E_x^{tot}$ fields, a total $E_y^{tot}(i+1, j+\tfrac12)$ field and a scattered $E_y^{scat}(i, j+\tfrac12, n)$ field. We can make this last consistent by adding the known incident field $E_y^{inc}(i, j+\tfrac12, n)$. The update equation for $i = i_L$ becomes:

$$H_z^{tot}(i_L+\tfrac12, j+\tfrac12, n+\tfrac12) = H_z^{tot}(i_L+\tfrac12, j+\tfrac12, n-\tfrac12) + \frac{\Delta t}{\mu\Delta y}\left[E_x^{tot}(i_L+\tfrac12, j+1, n) - E_x^{tot}(i_L+\tfrac12, j, n)\right] - \frac{\Delta t}{\mu\Delta x}\left[E_y^{tot}(i_L+1, j+\tfrac12, n) - E_y^{scat}(i_L, j+\tfrac12, n) - E_y^{inc}(i_L, j+\tfrac12, n)\right] \tag{3.22}$$

For the $E_y$ component located at $x = (L - 1)\Delta x$, i.e. just to the left of the interface, all the fields in the update equations are scattered, except for the $H_z$ field located at $x_L = (L - \tfrac12)\Delta x$. The update equation for this becomes:

$$E_y^{scat}(i_L, j+\tfrac12, n+1) = E_y^{scat}(i_L, j+\tfrac12, n) - \frac{\Delta t}{\epsilon\Delta x}\left[H_z^{tot}(i_L+\tfrac12, j+\tfrac12, n+\tfrac12) - H_z^{inc}(i_L+\tfrac12, j+\tfrac12, n+\tfrac12) - H_z^{scat}(i_L-\tfrac12, j+\tfrac12, n+\tfrac12)\right] \tag{3.23}$$

The update equation for the other component, $E_x$, involves either only total fields (for $i \ge i_L$) or only scattered fields (for $i < i_L$) and hence can be used without change.
As an example, if the incident field is a plane wave, propagating in the $x$-direction, in free space, with time history $x(t)$, with a $z$-polarized magnetic field, the expressions for the incident fields are:

$$\mathbf{E}^{inc} = x(t - t_{D_e})\,\hat{\mathbf{y}} \tag{3.24}$$

$$\mathbf{H}^{inc} = \frac{1}{\eta_0}\,x(t - t_{D_h})\,\hat{\mathbf{z}} \tag{3.25}$$

$\eta_0 = \sqrt{\mu_0/\epsilon_0}$ is the wave impedance of free space. $t_D$ is the delay time from some arbitrary start location. For the problem shown in Fig. 3.2, this could conveniently be taken as $x = 0$. For the magnetic fields located at $x_L = (L - \tfrac12)\Delta x$, the delay time is $t_{D_h} = (L - \tfrac12)\Delta x/c$; for the electric fields located at $x = (L - 1)\Delta x$, the delay time is $t_{D_e} = (L - 1)\Delta x/c$. In short, the half-delta difference in spatial position of the fields must be taken into account. Note that these delay times are only valid for the specific case of a field propagating only in the $x$-direction. Formulas are easily derived for plane waves propagating in other directions, but the above is sufficient for now.
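The role of the two delay times can be sketched as follows. The pulse used here is a hypothetical stand-in for the time history (any smooth pulse would do), and the names `t_De` and `t_Dh` are labels chosen for this illustration.

```matlab
c    = 299792458;                  % speed of light (m/s)
mu0  = 4e-7 * pi;                  % permeability of free space (H/m)
eta0 = mu0 * c;                    % wave impedance of free space, equal to sqrt(mu0/eps0)
delta_s = 0.01;  L = 50;           % example cell size (m) and interface index
sigma = 1e-10;   m = 4 * sigma;    % example pulse width and offset (s)

% Hypothetical stand-in for the time history of the incident wave.
pulse = @(t) exp(-(t - m).^2 / (2 * sigma^2));

t_De = (L - 1)   * delta_s / c;    % delay to the E_y nodes at x = (L-1)*delta_s
t_Dh = (L - 0.5) * delta_s / c;    % delay to the H_z nodes at x = (L-1/2)*delta_s

t = 2.0e-9;                        % an arbitrary sample time (s)
E_y_inc = pulse(t - t_De);         % incident E_y, cf. Eq. (3.24)
H_z_inc = pulse(t - t_Dh) / eta0;  % incident H_z, cf. Eq. (3.25)
```

The two delays differ by exactly half a cell of travel time, `0.5*delta_s/c`; omitting this offset is an easy mistake to make in a scattered/total field implementation.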

Considering how simple it was to include the 1D source, one might wonder why 
this apparently much more complex approach is necessary in 2D. It is possible 
to include a simple line source in 2D in much the same way as in 1D, by simply 
specifying the value of the source at a particular point in the mesh. This however 
radiates cylindrical, not plane, waves; hence, this approach is not useful for most 
scattering problems. However, it is convenient for initial code testing, and also 
for checking the operation of absorbing boundary conditions. The next idea that 
springs to mind is simply to drive a line of points in the mesh with some source 
function. The problem with this is more subtle; suffice it to say for now that al- 
though this seems like a simple approach, it does not give good results in practice. 


3.2.4 Meshing the scatterer 


The process of generating a suitable FDTD grid for a problem is often called “meshing.” As already indicated, this can be a formidable problem in general. We will be using a very simple test problem: a circular cylinder.³ This will allow us to make a very simple “mesher.” We will place the cylinder, radius $a$, at a convenient location in the mesh and then simply compute the distance from its axis to each point in the mesh; if this distance exceeds $a$, the point lies outside the cylinder; if it is less than or equal to $a$, it lies inside or on the surface. Since the $E_x$ and $E_y$ field components are offset in space, we must do this for each component. As a first pass, we will make the cylinder highly conducting, indeed perfectly conducting, so that the (total) fields inside the cylinder are zero. The appropriate boundary condition will be to zero the fields tangential to the cylinder.
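A minimal sketch of such a “mesher” is given below. The grid size, the cylinder position and the assumption that integer grid points sit at $(i-1)\Delta s$ are illustrative choices, not taken from the text.

```matlab
N_x = 40; N_y = 40; delta_s = 0.005;      % example grid
a = 0.03; x_c = 0.1; y_c = 0.1;           % example radius and axis location (m)

x_int  = ((1:N_x) - 1)   * delta_s;   y_int  = ((1:N_y) - 1)   * delta_s;
x_half = ((1:N_x) - 0.5) * delta_s;   y_half = ((1:N_y) - 0.5) * delta_s;

% The test is done separately for each component, at that component's own location.
[X, Y] = ndgrid(x_half, y_int);            % E_x node positions
in_Ex = hypot(X - x_c, Y - y_c) <= a;      % true inside or on the surface
[X, Y] = ndgrid(x_int, y_half);            % E_y node positions
in_Ey = hypot(X - x_c, Y - y_c) <= a;
```

The logical masks `in_Ex` and `in_Ey` mark the field samples that belong to the cylinder; how they are then used to enforce the boundary condition is a separate step.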

The above sounds very straightforward. It is only when coding that a whole number of problematic issues suddenly appear. The first is that we have spoken about “tangential” fields. With a round cylinder, the tangent will only lie in the $\pm x$ or $\pm y$ directions at four points (top, bottom, right and left in Fig. 3.2). Elsewhere, in all the other FDTD cells which the boundary of the cylinder passes through, we are only going to be able to approximate the boundary, and because we use a rectangular grid, the resulting approximation is often called a “stair-step” approximation.

3 “Cylinder” is the general mathematical description of any object generated by translating a two-dimensional cross-section along its normal. For instance, a “cylinder” may be square. (In normal English usage, a cylinder is round.) The full mathematical term for what is commonly called a cylinder is a “right circular cylinder.”

This problem emerged because we are modelling round (or more generally, curvilinear) structures with a rectangular grid. But even if we only model rectangular structures which can be aligned to the FDTD grid, another problem still remains. Refer back to Fig. 3.1. Now, instead of modelling a PEC (perfect electrical conductor) scatterer, let us rather model a cylinder made of some dielectric material, with relative permittivity $\epsilon_r$. In the update equations, we need to specify the value for $\epsilon = \epsilon_r\epsilon_0$. Assume we do this for $E_y(i+1, j+\tfrac12)$. Now, what do we do with the two $E_x$ components located $\Delta x/2$ to the left of this interface? If we set $\epsilon_r$ for them as well, the interface has effectively been “moved” slightly to the left, and now we have the same problem with $E_y(i, j+\tfrac12)$. If we do not, the interface is then located somewhere between $(i+\tfrac12)\Delta x$ and $(i+1)\Delta x$. Again, this is a problem without a simple answer. Due to the half-step offsets in the FDTD Yee grid, there is an uncertainty about the precise position of material interfaces in the basic Yee algorithm. Since it is a maximum of a half-cell, and the cells are usually quite small, it is normally acceptable, but can be problematic. (“Averaging” methods have been used successfully to correct this, and to improve the modelling of curvilinear structures, but we will not consider these at present.)

One final issue still remains to be solved before we develop a 2D FDTD code 
for scattering off a cylinder: how do we terminate the mesh? The problem is 
the following: we want to simulate a free-space environment, which means that 
waves scattered off the target should radiate radially away to infinity, diminishing 
in strength and eventually disappearing. Clearly, we cannot make an FDTD grid 
sufficiently large to simulate this. If one has seen an anechoic chamber used for 
antenna measurements, one will know that antenna designers have a similar prob- 
lem; they have solved this by coating the walls of the anechoic chamber with an 
absorbing material. This, effectively, is what we will attempt to do now. 


3.2.5 Absorbing boundary conditions 


The field of absorbing boundary conditions (ABCs) attracted much research 
throughout the 1980s and early 1990s. Two methods have historically been pur- 
sued: radiation BCs and absorbing BCs. The term ABC is also used more gener- 
ally for both. The former modifies the FDTD update equations; the latter modifies 
the material properties in the mesh. 

Having really good ABCs (by which is meant ABCs with a reflection coefficient less than −60 or −70 dB) means that it is possible to bring the ABC close to the radiating/scattering structure, “wasting” as few Yee cells as possible meshing up free space. Due to the great interest in the field, one will find a large number of references on the topic. Later in this chapter, we will introduce a revolutionary boundary condition, the perfectly matched layer, but for the time being, we will use a very simple ABC. The idea is the following, for a $-x$ traveling wave on the plane $x = 0$. It uses the concept of a one-way wave equation, also known as the advective equation, with a wave solution $f(x + ct)$, traveling only in the $-x$ direction:


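$$\left(\frac{\partial}{\partial x} - \frac{1}{c}\frac{\partial}{\partial t}\right)\phi(x,t) = 0 \tag{3.26}$$

$\phi(x,t)$ represents one of the components of the wave. This leads then to a 1D ABC, as follows. We impose this one-way wave equation on a wave incident on a surface normal to $x$:

$$\left.\frac{\partial}{\partial x}\phi(x,t)\right|_{x=0} = \left.\frac{1}{c}\frac{\partial}{\partial t}\phi(x,t)\right|_{x=0} \tag{3.27}$$

Applying forward differencing in $x$ and $t$, one obtains:

$$\phi_1^n - \phi_0^n \approx \frac{\Delta x}{c\Delta t}\left(\phi_0^{n+1} - \phi_0^n\right) \tag{3.28}$$

Finally, rewrite this to give the desired ABC:

$$\phi_0^{n+1} = \phi_0^n\left(1 - \frac{c\Delta t}{\Delta x}\right) + \frac{c\Delta t}{\Delta x}\,\phi_1^n \tag{3.29}$$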


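This analysis must be repeated at the boundary $x = x_{max}$. In this case, the relevant one-way wave equation, with solution $f(x - ct)$, traveling only in the $+x$ direction, is

$$\left(\frac{\partial}{\partial x} + \frac{1}{c}\frac{\partial}{\partial t}\right)\phi(x,t) = 0 \tag{3.30}$$

Imposed on a wave incident on a surface normal to $x$, the wave is again “absorbed.” This leads then to the other 1D ABC:

$$\left.\frac{\partial}{\partial x}\phi(x,t)\right|_{x=x_{max}} = -\left.\frac{1}{c}\frac{\partial}{\partial t}\phi(x,t)\right|_{x=x_{max}} \tag{3.31}$$

Applying backward differencing in $x$ and forward differencing in $t$ as before, one obtains:

$$\phi_{N_x}^n - \phi_{N_x-1}^n \approx -\frac{\Delta x}{c\Delta t}\left(\phi_{N_x}^{n+1} - \phi_{N_x}^n\right) \tag{3.32}$$

Finally, rewrite this to give the desired ABC:

$$\phi_{N_x}^{n+1} = \phi_{N_x}^n\left(1 - \frac{c\Delta t}{\Delta x}\right) + \frac{c\Delta t}{\Delta x}\,\phi_{N_x-1}^n \tag{3.33}$$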




Interestingly, the equation is identical in form to Eq. (3.29). The extension to $\pm y$ propagating waves on the planes $y = 0$ and $y = y_{max}$ is obvious.
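The two boundary updates can be sketched in a few lines. The sample values below are made up, the interior update is deliberately omitted, and `r` is simply shorthand for $c\Delta t/\Delta x$.

```matlab
c = 1; delta_x = 1; delta_t = 0.5;        % normalized example values
r = c * delta_t / delta_x;                % the ratio c*delta_t/delta_x
phi_old = [0.2 0.5 0.9 0.5 0.2];          % phi at time step n (made-up samples)
phi_new = phi_old;                         % interior update omitted in this sketch
N = numel(phi_old);

% First-order ABC on the left boundary (x = 0):
phi_new(1) = phi_old(1) * (1 - r) + r * phi_old(2);
% The same form on the right boundary (x = x_max), using the inner neighbour:
phi_new(N) = phi_old(N) * (1 - r) + r * phi_old(N - 1);
```

For the special case `r = 1`, each boundary value simply takes the previous value of its inner neighbour, which is what one would expect for a pulse moving one cell per time step.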

As noted, $\phi$ was used here; clearly, we need to apply this to the various tangential field components at each boundary. Note that we need only apply it to either $E$ or $H$; once we establish one of the fields “outside” the computational domain, the usual update equations, combined of course with the half-space step offset, establish the other.

Because this ABC used forward differencing, it is only accurate to first order. 
(Remember that the Yee scheme has second-order accuracy.) It is “exact” in 1D; 
in 2D and 3D, for paraxial incidence, reflection coefficients below −25 dB may
be obtained, but it degrades rapidly off-normal. Mur, in 1981, published a more 
complete first-order ABC, as well as a second-order one. Details are available 
in [1, Chapter 6]. These first- and second-order Mur ABCs are still widely used, 
owing to their simplicity and reasonable effectiveness; however, commercial codes 
should also offer perfectly matched layers. 

We now have all the tools needed to produce a 2D FDTD simulation of elec- 
tromagnetic scattering from a cylinder in free space — we already have suitable 
wideband pulses from our 1D work. We will now proceed to develop the simula- 
tor. 


3.2.6 Developing the simulator 


There are a number of issues to consider when turning this algorithm into code. Although we will not be excessively concerned with computational efficiency initially, it is good practice nonetheless to consider some issues. Firstly, division is a much more expensive process in terms of computing time than multiplication. Equation (3.18) contains a term $\Delta t/(\mu\Delta s)$, and Eqs. (3.19) and (3.20) both contain the term $\Delta t/(\epsilon\Delta s)$. Usually, there will only be a few different material regions in an FDTD code. So, it would be better to store these as an array representing material properties, perform the division once before the time-stepping starts, and then simply use the relevant value of this array at each stage. One of these is needed per field component:


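$$H_z(i,j,n) = H_z(i,j,n-1) + D_a(i,j)\left[E_x(i,j+1,n) - E_x(i,j,n) + E_y(i,j,n) - E_y(i+1,j,n)\right] \tag{3.34}$$

$$E_x(i,j,n+1) = E_x(i,j,n) + C_{ex}(i,j)\left[H_z(i,j,n) - H_z(i,j-1,n)\right] \tag{3.35}$$

$$E_y(i,j,n+1) = E_y(i,j,n) - C_{ey}(i,j)\left[H_z(i,j,n) - H_z(i-1,j,n)\right] \tag{3.36}$$

with

$$C_{ex}(i,j) = \frac{\Delta t}{\epsilon\left([i-1/2]\Delta s,\, [j-1]\Delta s\right)\Delta s} \tag{3.37}$$

$$C_{ey}(i,j) = \frac{\Delta t}{\epsilon\left([i-1]\Delta s,\, [j-1/2]\Delta s\right)\Delta s} \tag{3.38}$$

$$D_a(i,j) = \frac{\Delta t}{\mu\left([i-1/2]\Delta s,\, [j-1/2]\Delta s\right)\Delta s} \tag{3.39}$$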


where the $(x, y)$ coordinates at which $\epsilon$ and $\mu$ are to be evaluated are explicitly indicated. The previous discussion in Section 3.2.4 regarding the exact position of material interfaces applies again.
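The idea of dividing once, before time-stepping starts, might look as follows for a free-space grid. The array names `eps_Ex`, `eps_Ey` and `mu_Hz` are hypothetical; `C_ex`, `C_ey` and `D` correspond to the coefficients of Eqs. (3.37) to (3.39).

```matlab
eps0 = 8.8541878128e-12;  mu0 = 4e-7 * pi;   % free-space constants (SI)
N_x = 200; N_y = 100;
delta_s = 0.01;  delta_t = 2.0e-11;          % example values, below the 2D Courant limit

% Material maps, each sampled at its own component's location (free space here).
eps_Ex = eps0 * ones(N_x, N_y);   % at ([i-1/2]*delta_s, [j-1]*delta_s)
eps_Ey = eps0 * ones(N_x, N_y);   % at ([i-1]*delta_s,   [j-1/2]*delta_s)
mu_Hz  = mu0  * ones(N_x, N_y);   % at ([i-1/2]*delta_s, [j-1/2]*delta_s)

% Divide once; only multiplications then remain inside the time loop.
C_ex = delta_t ./ (eps_Ex * delta_s);
C_ey = delta_t ./ (eps_Ey * delta_s);
D    = delta_t ./ (mu_Hz  * delta_s);
```

A dielectric or magnetic region would be introduced simply by changing the relevant entries of the material maps before the division.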


Coding hints — programming the update equations efficiently 


The obvious way of programming Eqs. (3.34)-(3.36) is to use a double-loop 
(a DO-loop in FORTRAN, a FOR-loop in many other languages, including 
MATLAB). However, with MATLAB, this is not a good idea. The problem is 
that MATLAB is an interpreted language, as opposed to a compiled one, and only 
runs efficiently when its (highly optimized) vector commands can be used by the 
interpreter. So, an update such as Eq. (3.36) is best programmed as in Fig. 3.3 — 
note that the ... is the MATLAB line continuation character. 


E_y_n(2:N_x,2:N_y) = E_y_nmin1(2:N_x,2:N_y) ...
    - C_ey(2:N_x,2:N_y).*( H_z_n(2:N_x,2:N_y) - H_z_n(1:N_x-1,2:N_y) ) ;

Figure 3.3 MATLAB code stub for updating $E_y$.


This looks somewhat cryptic on a first reading: the key operation is H_z_n(2:N_x,2:N_y) - H_z_n(1:N_x-1,2:N_y), which effectively shifts the second occurrence of the H_z_n array along its first dimension (corresponding to $x$) and permits the difference to be formed as a vector operation. It is also clear why the indices must run from 2 to $N_x$, rather than from 1 to $N_x$ (and similarly along the second dimension); otherwise, the operation would refer to non-existing array elements at index 0 when shifted. These, the boundary values, must be computed separately. The .* operation in MATLAB denotes element-by-element multiplication (also known as the Hadamard product of two matrices).
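To see that the shifted-index form really is the same computation as the double loop, the two can be compared on a small array. The data and the constant coefficient array below are arbitrary test values, not physical fields.

```matlab
N_x = 6; N_y = 5;
H_z_n     = reshape(1:N_x*N_y, N_x, N_y).^2;   % arbitrary test data
E_y_nmin1 = ones(N_x, N_y);
C_ey      = 0.5 * ones(N_x, N_y);              % hypothetical coefficient array

% Double-loop form of the E_y update:
E_y_loop = E_y_nmin1;
for ii = 2:N_x
    for jj = 2:N_y
        E_y_loop(ii,jj) = E_y_nmin1(ii,jj) ...
            - C_ey(ii,jj) * (H_z_n(ii,jj) - H_z_n(ii-1,jj));
    end
end

% Equivalent vectorized form, using shifted index ranges:
E_y_vec = E_y_nmin1;
E_y_vec(2:N_x,2:N_y) = E_y_nmin1(2:N_x,2:N_y) ...
    - C_ey(2:N_x,2:N_y).*( H_z_n(2:N_x,2:N_y) - H_z_n(1:N_x-1,2:N_y) );

max_diff = max(abs(E_y_loop(:) - E_y_vec(:)));  % should vanish
```

A comparison of this kind on a tiny grid is also a cheap way of checking a newly written vectorized update against a loop that is easier to read.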


A point to note when coding is that because the FDTD algorithm is explicit, the new values that we compute at a point are not affected by the new values at any other points. Hence, we do not need to take particular care at the line corresponding to the scattered/total field interface $i = L$. We can update values at this point as usual with a vector operation, and then overwrite them with the correct values. (Obviously, the values of $H_z$, for instance, must be correct before we start the updates of the electric fields, and vice versa.) Although this involves a small amount of unnecessary computation (in this case, we compute the values along the line separating the scattered/total field twice), the savings in code complexity are so significant that this is almost universal practice in FDTD codes. In the code stub shown in Fig. 3.4, we show the update for the $H_z$ field, demonstrating this idea. The semicolons at the end of each statement prevent the results being written to the command window, which is essential with the large datasets which the FDTD can easily generate. gaussder is a function which returns a suitable differentiated Gaussian.

% Update H fields:
H_z_n(1:N_x-1,1:N_y-1) = H_z_nmin1(1:N_x-1,1:N_y-1) ...
    + D(1:N_x-1,1:N_y-1).*( E_x_nmin1(1:N_x-1,2:N_y) - E_x_nmin1(1:N_x-1,1:N_y-1) ...
    + E_y_nmin1(1:N_x-1,1:N_y-1) - E_y_nmin1(2:N_x,1:N_y-1) ) ;
% Special update on scat/tot field boundary
E_y_nmin1_inc = ones(1,N_y)*gaussder((m-1)*delta_t - (L-1)*delta_s/c,m_offset,sigma) ;
H_z_n(L,1:N_y-1) = H_z_nmin1(L,1:N_y-1) ...
    + D(L,1:N_y-1).*( E_x_nmin1(L,2:N_y) - E_x_nmin1(L,1:N_y-1) ...
    + E_y_nmin1(L,1:N_y-1) + E_y_nmin1_inc(1:N_y-1) - E_y_nmin1(L+1,1:N_y-1) ) ;

Figure 3.4 MATLAB code stub for updating $H_z$.

With the 1D FDTD, the algorithm is simple enough that it is relatively easy 
to program correctly. However, our 2D FDTD simulator is already sufficiently 
complex that to try to program it in its entirety in one go is likely to lead to great 
frustration. There are no fewer than three major, different types of errors that can be
made. How to test the code systematically, and locate likely errors, will now be 
discussed. 


Coding hints — frequently made errors in MATLAB 


MATLAB is an excellent environment for quickly testing and demonstrating algo- 
rithms. However, from the viewpoint of programming, it has a number of “fea- 
tures” which would be seen as deficiencies in most programming languages. 
The most prominent of these is that it is not a strictly typed language — indeed, 
MATLAB has many properties of a scripting language. This means that variables 
do not need to be declared before they are used. The advantage is convenience; 
the drawback is reliability. Firstly, one can accidentally overwrite an existing 
variable; in particular, i and j often suffer this fate. A variant of this is that a subtle spelling error creates a different (and usually undefined) variable. Some other errors frequently made in MATLAB, in particular by programmers used to other languages, include:

Indices in for loops. The correct format for the for loop indices is for ii=1:N_x, for example. FORTRAN programmers in particular are inclined to code this as for ii=1,N_x, which is incorrect in MATLAB.

Testing equality versus assignment. The correct logical expression to test if ii is equal to jj is if ii == jj (as in C). Again, FORTRAN programmers often code this as if ii = jj, which assigns the value of jj to ii.


Both these errors are especially frustrating to locate; MATLAB executes the 
former incorrectly (or at least incorrectly in terms of the programmer’s ex- 
pectations), and earlier versions also executed the latter (later versions issue a 
warning). 
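The accidental overwriting of i mentioned above is easy to reproduce. The following short sketch uses made-up variable names and shows one common way of side-stepping the problem, namely writing imaginary constants as literals such as 3i.

```matlab
for i = 1:3, end          % a loop counter named i; afterwards i holds the value 3
z_bad  = 2 + 3*i;         % intended as a complex number, but evaluates to the real 2 + 3*3
z_good = 2 + 3i;          % the literal 3i does not depend on the variable i
```

Using doubled names such as ii and jj for loop counters, as in the examples above, avoids the clash altogether.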


Implementing the update equations 


The easiest mistakes to make here are with the indices. In particular, the repetitiveness of FDTD equations encourages cutting-and-pasting, and one has to be very careful to correct all the indices (and also field subscripts) when doing this. A simple test which can be used is to note that an FDTD update equation involving (say) the $x$ component of a field on the right-hand side never involves a partial derivative (which is of course a finite difference in the code) in $x$ (i.e. the first index). For instance, look at the term in the update for $H_z$ (Fig. 3.4):

E_x_nmin1(1:N_x-1,2:N_y) - E_x_nmin1(1:N_x-1,1:N_y-1)

Clearly, the following would be incorrect:

E_x_nmin1(2:N_x,2:N_y) - E_x_nmin1(1:N_x-1,2:N_y) % THIS IS WRONG!


It is essential to check the update equations by very carefully reading through 
each one as programmed. 

To check that the update equations are working, a very simple source at one 
point can be used. Physically, this represents an infinitely long line source. Instead 
of the full scattered/total field approach shown in Fig. 3.4, the code in Fig. 3.5 
injects a source of cylindrical waves in the center of the mesh. Again, note that 
the source update at $(N_x/2, N_y/2)$ simply overwrites the just updated value. Note
also that the $E$ field update equations in this case are simply those of free space.
In combination with this, the outer boundaries at this stage can simply be set as PECs by zeroing the relevant tangential electric field components; see Fig. 3.6 for an example.

% Update H fields:
H_z_n(1:N_x-1,1:N_y-1) = H_z_nmin1(1:N_x-1,1:N_y-1) ...
    + D(1:N_x-1,1:N_y-1).*( E_x_nmin1(1:N_x-1,2:N_y) - E_x_nmin1(1:N_x-1,1:N_y-1) ...
    + E_y_nmin1(1:N_x-1,1:N_y-1) - E_y_nmin1(2:N_x,1:N_y-1) ) ;
% Drive a test line source - used to check basic operation
H_z_n(N_x/2,N_y/2) = gaussder((m-1)*delta_t,m_offset,sigma) ;

Figure 3.5 MATLAB code stub for updating $H_z$, using a point (line) source.


% Fix outer values of E tangential as PEC:

Figure 3.6 MATLAB code stub for setting PEC boundaries.
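One plausible way of zeroing the tangential electric fields on the four outer walls is sketched below. The array names follow the other code stubs; the assumption that the walls coincide with the first and last rows and columns of the arrays is made here for illustration and is not taken from the original code.

```matlab
N_x = 8; N_y = 6;
E_x_n = rand(N_x, N_y);  E_y_n = rand(N_x, N_y);   % stand-in field data

% On walls of constant y the tangential component is E_x;
% on walls of constant x it is E_y.
E_x_n(:, 1)   = 0;   % bottom wall
E_x_n(:, N_y) = 0;   % top wall
E_y_n(1, :)   = 0;   % left wall
E_y_n(N_x, :) = 0;   % right wall
```

In a time-stepping loop these assignments would be repeated after each electric field update, so that the boundary values are never left at a non-zero value.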


Implementing the plane-wave source 


Once one has confidence that the update equations are working, one can proceed to 
test the full scattered/total field formulation, incorporating the plane-wave source. 
Now, one needs to start thinking about the electromagnetics of the problem. In the 
1D case, we simplified matters by using a set of equations with the speed of light 
set to 1 m/s. Now, we are working with the real world, and $c \approx 3 \times 10^8$ m/s. Since we are primarily interested in radio-frequency (RF) problems, we will select an RF source, with Gaussian derivative shape, with frequency content in the gigahertz range. It turns out to be convenient to select a signal with $\sigma \approx 1 \times 10^{-10}$; this produces a signal with peak spectral amplitude at about 1.5 GHz; reference to Fig. 2.16⁴ shows that at around twice the frequency of peak spectral amplitude, the spectrum has decayed to around 30% of the peak value. In the present case, this is 3 GHz; the wavelength in free space is 10 cm (0.1 m) and now we have some guidelines for setting $\Delta s$: we should make this around 1/10 of the wavelength at 3 GHz, viz. $\Delta s = 0.01$ m. (Note that we must be careful to work in SI units!) $\Delta t$ will be set by the Courant limit (a maximum of 23.587 ps, when using the exact value for $c$).
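The chain of reasoning from pulse width to cell size can be sketched numerically. The sketch relies on the standard result that the magnitude spectrum of a differentiated Gaussian of width $\sigma$ is proportional to $f e^{-2\pi^2\sigma^2 f^2}$, which peaks at $f = 1/(2\pi\sigma)$; the variable names are illustrative.

```matlab
c = 299792458;  sigma = 1e-10;
f_peak = 1 / (2*pi*sigma);          % spectral peak of a differentiated Gaussian (Hz)
f_max  = 2 * f_peak;                % twice the peak, taken as the highest frequency of interest
lambda_min = c / f_max;             % corresponding free-space wavelength (m)
delta_s_guide = lambda_min / 10;    % rule of thumb: about ten cells per wavelength
fprintf('f_peak = %.2f GHz, suggested delta_s = %.4f m\n', f_peak/1e9, delta_s_guide);
```

The suggested cell size comes out just under 1 cm, which is consistent with the rounded choice of 0.01 m made above.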

For testing the code, it is tempting to set $N_x$ and $N_y$ quite small, for instance, 5 or 10. Whilst this is occasionally necessary when something is really wrong and one is having to step through the code, it is actually a bad idea in general. The reason is that the absorbing boundary conditions are not included yet, and the temporary PEC boundaries suggested above result of course in the wave reflecting back. With small domains, these reflections mean that it is not possible to observe the field develop and propagate properly. A good test uses $N_x = 200$ and $N_y = 100$ (corresponding physically, with $\Delta s = 0.01$ m, to an area of 2 × 1 m²). The scattered/total field zone is placed at $L = 50$.


4 That plot was normalized to $\sigma = 1$; the extension is obvious.

Figure 3.7 Gaussian derivative pulse used for 2D FDTD simulation.


The Gaussian derivative pulse defined in Eq. (2.71) was obtained by differentiating Eq. (2.68), and has inconvenient amplitude behavior (being proportional to $1/\sigma^2$). The following pulse has a far more convenient, almost normalized amplitude:

$$v_\sigma(t) = \frac{-4\,(t - m)\,e^{-(t-m)^2/2\sigma^2}}{\sqrt{2\pi}\,\sigma} \tag{3.40}$$

Its time history for $\sigma = 1 \times 10^{-10}$, and with $m = 4\sigma$, is shown in Fig. 3.7. The peak amplitude is 0.9670, at 0.3322 ns.


Coding hints — a normalized Gaussian derivative pulse 


The following equation defines a properly normalized Gaussian derivative pulse: 


$$v_\sigma(t) = \frac{e^{1/2}}{\sigma}\,(t - m)\,e^{-(t-m)^2/2\sigma^2} \tag{3.41}$$

The normalizing constant $e^{1/2}/\sigma$ provides a unit peak amplitude at $t - m = \pm\sigma$. Since the results in this chapter do not require this, the signal in Eq. (3.40) is used in the following discussion.

Figure 3.8 Gaussian derivative pulse at a point just to the right of the scattered/total field zone.
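The two pulse definitions can be compared with a short script. This is a sketch under the assumption that Eq. (3.40) has the form $-4(t-m)e^{-(t-m)^2/2\sigma^2}/(\sqrt{2\pi}\,\sigma)$ and that Eq. (3.41) uses the normalizing constant $e^{1/2}/\sigma$; the function handle names are hypothetical.

```matlab
sigma = 1e-10;  m = 4 * sigma;
% "Almost normalized" pulse, assumed form of Eq. (3.40):
v_340 = @(t) -4 * (t - m) .* exp(-(t - m).^2 / (2*sigma^2)) / (sqrt(2*pi) * sigma);
% Unit-peak pulse, assumed form of Eq. (3.41):
v_341 = @(t) exp(0.5) / sigma * (t - m) .* exp(-(t - m).^2 / (2*sigma^2));

t = linspace(0, 8*sigma, 8001);     % fine sampling, for inspection only
peak_340 = max(abs(v_340(t)));      % analytically 4*exp(-1/2)/sqrt(2*pi)
peak_341 = max(abs(v_341(t)));      % analytically 1, reached at t - m = +/- sigma
```

Writing the source as a small function of time, separate from the field updates, also makes it straightforward to plot and check the pulse on its own before it is injected into the grid.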


This can now be injected into the scattered/total field code. We will monitor the $H_z$ field (scaled by $\eta_0$, to give a peak value close to unity). We will do this at point 1, with indices $(L + 1 = 51, 50)$, just to the right of the scattered/total field interface, and at point 2, with indices $(101, 50)$. The result, for $M = 400$ time-steps, is shown in Fig. 3.8. The first peak value is at 1.9577 ns, the second at 3.6559 ns. Now, we establish whether this checks with basic physics. The time difference between the peaks of these pulses is 1.6982 ns. In free space, it should take 1.6678 ns propagating at the speed of light to cover the distance of $50\Delta x = 0.5$ m. This is a difference of around 1.8%, very probably due to numerical dispersion. To confirm this, the problem should be rerun, using a finer mesh. If this is done with $\Delta s$ reduced by half to 0.005 m, the time difference reduces to 1.6746 ns, corresponding to an error of around 0.41%, and confirming that numerical dispersion was indeed the cause of the discrepancy.
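The arithmetic behind this check is easily reproduced. The peak times below are the values quoted in the text; everything else follows from the speed of light.

```matlab
c = 299792458;
t_exact   = 0.5 / c;                      % free-space transit time over 0.5 m (s)
dt_coarse = 3.6559e-9 - 1.9577e-9;        % peak-to-peak delay for delta_s = 0.01 m
dt_fine   = 1.6746e-9;                    % quoted delay for delta_s = 0.005 m
err_coarse = 100 * (dt_coarse - t_exact) / t_exact;   % error in percent
err_fine   = 100 * (dt_fine   - t_exact) / t_exact;
fprintf('Errors: %.2f%% (coarse), %.2f%% (fine)\n', err_coarse, err_fine);
```

Halving the cell size reduces the error by a factor of roughly four, which is what one would expect from a scheme with second-order accuracy.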

The above results demonstrate a working code. If, however, one is not this for- 
tunate, where does one look for the errors? The first thing to do is to ensure that 
the source really is working correctly. In MATLAB, the source was implemented as 
a function, in a separate m-file. This allows one to write a short test routine to see 
what the signal looks like. If this is correct, then the likely errors are in the scattered/total field equations. Be especially careful to ensure that the half-step offsets are correctly taken care of in both space and time.

Figure 3.9 Gaussian derivative pulse and reflection.


Implementation of the ABC 


Now that we have a code with working update equations, and can inject a plane wave into it, the PEC boundaries must be replaced with ABCs. An implementation note in passing: in order to test the ABCs, it is sufficient initially, using the plane-wave source in the previous section, to implement only the ABC at $i = N_x$, keeping PEC boundaries elsewhere. This permits one to get one set of ABCs working first.

Monitoring the signal at a location mid-way between the zone interface at $i = L$ and the right-hand boundary at $i = N_x$, the signal shown in Fig. 3.9 is recorded for $M = 600$. “Zooming-in” on the reflection, we see Fig. 3.10. The first (negative) peak, with a value of around −0.03 V/m, corresponds to the reflection of the first (positive) peak, which was around 0.8 (see Fig. 3.9), so the reflection coefficient of the ABC is around −30 dB.⁵


3.2.7 FDTD analysis of TE scattering from a PEC cylinder 


Now that the basic FDTD code is working, we are in a position to study TE$_z$ scattering from a PEC cylinder. Again, we will work in the microwave region. A convenient dimension will be a radius of $a = 0.03$ m. As a first pass, we will choose a rectangular domain, 2 m × 1 m (we will see shortly why we chose this). We will choose $\Delta s = 0.005$ m; this will allow a moderate approximation of the curvature of the cylinder. Even so, this means that the stair-step approximation of the cylinder will be quite crude (across the radius of the cylinder there are only six cells) and we should bear this in mind when interpreting the results we will generate.

5 One could compute this more accurately but all that is needed at present is a “ball-park” figure. The correct method for numerically evaluating reflection off ABCs is to run two simulations, one using a reference solution computed on a very large grid, and the other a much smaller grid using the ABC. The reflection is then computed by subtracting the reference solution from the ABC-corrupted solution. We will do this when we evaluate the PML ABC later in this chapter.

Figure 3.10 An enlargement of the signal in the region of the reflection.

The simplest method of introducing the cylinder into the mesh is by simply 
zeroing the relevant $C_{ex}(i, j)$ and $C_{ey}(i, j)$ coefficients; see Eqs. (3.37) and (3.38). This ensures
that the relevant electric fields inside and on the surface of the cylinder are zero. 
It is tempting to do the same with the magnetic fields; this however is incorrect, 
since it effectively also forces the tangential magnetic fields to zero at the cylinder’s 
“surface,” which is not the correct boundary condition. 

We want to compute the echo width of the target, usually abbreviated $\sigma_w$. It is defined as follows:

$$\sigma_w = \lim_{\ell\to\infty} 2\pi\ell\,\frac{|E^{scat}|^2}{|E^{inc}|^2} \tag{3.42}$$

In an FDTD simulation, some finite limit on $\ell$ is essential. The conventional 3D criterion for establishing the onset of the far-field, viz. $\ell > 2D^2/\lambda$, where $D$ is the largest dimension of the target, $D = 2a$ in this case, can be used. If we set $\ell \approx 1$ m, the minimum wavelength (and hence maximum frequency) at which this still satisfies the far-field criterion is around 7 mm, or over 40 GHz, so this is more than adequate for our purposes.
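The far-field estimate can be verified in a couple of lines; the variable names are illustrative.

```matlab
c = 299792458;  a = 0.03;  D = 2 * a;  ell = 1.0;
lambda_min = 2 * D^2 / ell;     % smallest wavelength for which ell >= 2*D^2/lambda (m)
f_max = c / lambda_min;         % corresponding highest frequency (Hz)
fprintf('lambda_min = %.1f mm, f_max = %.1f GHz\n', lambda_min*1e3, f_max/1e9);
```

This gives a wavelength of 7.2 mm and a frequency of a little under 42 GHz, in line with the figures quoted above.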

We also now appreciate how convenient the scattered/total field formulation is; 
we can immediately obtain the scattered field by placing our sample point in the 
scattered field zone. Here, we have the following considerations: we would like to 
be as far away from the cylinder as possible, but since the reflected signal can be 
expected to be quite small, we should also be far away enough from the left-hand 
wall that we can “gate out” unwanted reflections — remember that our ABC is far 
from perfect. Since we are only going to look at back-scattered fields, we can place 
the scatterer to the right in our grid. 

With these considerations in mind, then, we make the following choices. 


• Locate the cylinder at $x = 1.5$ m, $y = 0.5$ m.

• Place the scattered/total field boundary at $x = 1$ m.

• Record the scattered field at $x = 0.5$ m, $y = 0.5$ m (i.e. 1 m away from the target, and 0.5 m away from the closest walls).


We will see shortly (Fig. 3.13) that TE$_z$ back-scattering from a PEC cylinder increases rapidly with frequency up to a first resonance. This occurs when $ka \approx 0.8$, which for our cylinder with $a \approx 0.03$ corresponds to a frequency of just over 1 GHz. We also want to be able to capture the next resonances, so we need a signal with significant frequency content in this region. A differentiated Gaussian pulse with $\sigma = 5.0 \times 10^{-11}$ has a spectrum peaking at around 3.2 GHz, which will be adequate here. A longer pulse would work from the viewpoint of spectral content, but this shorter pulse is convenient for another reason we will see shortly.

Finally, we note that Eq. (3.42) is a frequency domain expression. The Fourier 
transforms of both the scattered and the incident fields must be computed, and 
divided pointwise.⁶ Note also that this expression, being a power ratio, requires
squaring the magnitude of the resultant transforms. (The phase information is ir- 
relevant here.) 

The back-scattered signal computed with the FDTD, with grid and problem as 
set up above, is shown in Fig. 3.11. Although we can go ahead and transform this, 
we should note that the main signal lies in the region 8–11 ns; the signal at 12 ns is
almost certainly an unwanted reflection of some type. Similarly, the small “glitch” 
at 6 ns is also very likely to be some form of computational artifact. It is usual 
practice to remove these by “windowing” — although quite sophisticated windows 
exist, here it is sufficient simply to zero the signal outside this window. This is 
shown in Fig. 3.12. Finally, the echo width is plotted in Fig. 3.13, normalized 
by $\pi a$ and compared to results computed using an exact eigenfunction solution
[2, Figs. 12-34]. (The frequency axis is also normalized; this is usual practice with 


6 In MATLAB, the ./ operation.

Figure 3.11 Back-scattered signal from the PEC cylinder. Medium mesh, $\Delta s = 0.0025$ m,
$\sigma = 5 \times 10^{-11}$.

Figure 3.12 Windowed back-scattered signal from the PEC cylinder. Medium mesh,
$\Delta s = 0.0025$ m, $\sigma = 5 \times 10^{-11}$.

Figure 3.13 Normalized echo width for the PEC cylinder: FDTD results and eigenfunction
solution. Medium mesh, $\Delta s = 0.0025$ m, $\sigma = 5 \times 10^{-11}$.


canonical shapes such as cylinders. k is the free-space wavenumber, and a the 
cylinder radius.) 
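The post-processing steps described above (zeroing the record outside a time window, transforming both signals, dividing pointwise and squaring the magnitude) can be sketched as follows. This is an illustration under stated assumptions, not the book's script: the signals are synthetic, the helper names are hypothetical, a naive DFT is used to keep the example dependency-free, and any geometric prefactor from Eq. (3.42) is deliberately left out.

```python
import cmath
import math


def gate(signal, dt, t_start, t_stop):
    """Rectangular window: zero every sample outside [t_start, t_stop]."""
    return [s if t_start <= n * dt <= t_stop else 0.0 for n, s in enumerate(signal)]


def dft(signal):
    """Naive O(N^2) discrete Fourier transform, adequate for short records."""
    n = len(signal)
    return [sum(s * cmath.exp(-2j * math.pi * k * m / n) for m, s in enumerate(signal))
            for k in range(n)]


def power_ratio(scattered, incident, floor=1e-12):
    """Pointwise |S(f)|^2 / |I(f)|^2; bins with no incident energy give None."""
    spec_s, spec_i = dft(scattered), dft(incident)
    return [abs(s) ** 2 / abs(i) ** 2 if abs(i) > floor else None
            for s, i in zip(spec_s, spec_i)]


# Synthetic data: the "scattered" record is the incident pulse at half amplitude,
# delayed, plus one late spurious sample that the gate removes.
n_samples = 32
incident = [0.0] * n_samples
incident[2:5] = [1.0, -2.0, 1.0]
raw = [0.0] * n_samples
raw[12:15] = [0.5, -1.0, 0.5]
raw[25] = 0.3                      # unwanted late reflection

gated = gate(raw, dt=1.0, t_start=10.0, t_stop=16.0)
ratios = power_ratio(gated, incident)
print(ratios[:4])
```

Because the incident pulse here has no DC content, the first bin is reported as None rather than dividing by zero; in practice one only evaluates the ratio where the incident spectrum has significant energy.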

The results in Fig. 3.13 show reasonable agreement at the first resonance, but 
the comparison is quite poor for the next resonances. To improve this, we first 
need to understand the physics of the scattering process. The first peak is simply 
energy which reflects directly off the cylinder, back in the direction of propaga- 
tion. (This is the reflection which asymptotic methods, such as geometrical optics, 
would compute.) The next peak is due to energy which attaches itself to the top 
(bottom) of the cylinder, and “creeps” around the shadowed side of the cylinder 
before detaching itself from the bottom (top). Clearly, this signal travels a longer 
distance than the direct reflection; depending on the cylinder’s size, it may rein- 
force the direct reflection or partially cancel it. This then accounts for the peak at 
around $ka \approx 2$. The extra distance traveled is $a + \pi a + a = (2 + \pi)a$; this signal
travels at the speed of light which should result in a delay of about 514 ps. If we 
inspect Fig. 3.12, we can see these two signals; the (negative) peak of the direct 
reflection is at around 8.3 ns and the (same) peak of the creeping wave is at around 
8.8 ns, i.e. around 500 ps apart. On this figure, there is another rather smaller sig- 
nal, approximately 800 ps later; this is the creeping wave which has gone right 
around the cylinder for a second time. It travels an extra $2\pi a$, a slightly longer
distance. 
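The creeping-wave delay quoted above follows directly from the extra path length. A minimal sketch, assuming a = 0.03 m and free-space propagation speed along the whole path (the function name is illustrative):

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def creeping_wave_delay(radius, speed=C0):
    """Delay of the first creeping wave relative to the direct reflection.

    The extra path is a + pi*a + a = (2 + pi) * a.
    """
    return (2.0 + math.pi) * radius / speed


delay = creeping_wave_delay(0.03)
print(f"predicted delay = {delay * 1e12:.0f} ps")
```

The predicted value is close to the roughly 500 ps separation read off the time-domain plot.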
Figure 3.14 Windowed back-scattered signal from the PEC cylinder, comparing the
medium and fine mesh solutions. For both solutions, $\sigma = 5 \times 10^{-11}$.


The problem is that the approximation of the round cylinder with the FDTD 
stair-step approximation is inadequate at higher frequencies; we need to refine the 
mesh. Time domain results comparing a finer mesh (with $\Delta s = 0.00125$ m) with
the medium mesh above are shown in Fig. 3.14; note the better pulse shape in the
finer mesh case. Results for both this finer mesh and for a coarser mesh ($\Delta s = 0.005$ m,
and using a longer signal with $\sigma = 1 \times 10^{-10}$) are shown in Fig. 3.15.
The eigenfunction data have been interpolated to make it easier to compare the 
respective results. The agreement is satisfactory, bearing in mind that although we 
satisfy the far-field criterion, ideally one should be a much larger distance from the 
scatterer. It is also clear that the solution will require an even finer mesh to get good 
agreement at the higher frequencies. We also note, perhaps surprisingly, that the 
coarse mesh solution appears to give a more accurate solution for the amplitude of 
the first resonance. However, we should bear in mind that we used a longer pulse, 
with lower spectral content, for this solution; the other meshes used pulses with 
peak spectra somewhat higher than this. 


3.2.8 Computational aspects 


Figure 3.15 Normalized echo width for the PEC cylinder showing three different FDTD
results compared with the eigenfunction solution.

One last aspect we should at least get an appreciation of before finishing this
introduction to the two-dimensional FDTD is the question of the amount of computation
required, and also the amount of computer storage needed. The most
computation is required in Eqs. (3.18)–(3.20), because each of these needs to be
updated at all $N_x \times N_y$ points at each of $M$ time-steps. All other operations, such
as boundary conditions, sources via the scattered/total field interface etc., involve
only either $N_x$ or $N_y$ points. Counting the number of operations, we see that to update
the $H_z$ field at each point requires five floating-point operations ($+$, $\times$), usually
abbreviated as flops. (The shift operations on the field components are ignored
in this type of count. The reason is that efficient computer languages recognize this
type of operation and perform the shift indirectly by an offset in memory access.)
The $E_x$ and $E_y$ field updates each require three flops. Thus, in total, the number of
operations required per time step is approximately $11 \times N_x \times N_y$ flops. The overall
number of operations is thus $11 \times N_x \times N_y \times M$ flops. If we also keep track of
the run-time (which MATLAB allows us to do, the cputime command being one
way of doing this, or one can simply use a stopwatch), the speed of the computer
for floating point operations can be computed. This is often known as the floprate, and
given as megaflops,⁷ a million floating-point operations per second, or gigaflops
($10^9$ flops per second). Some very fast supercomputers are specified in terms of
teraflops ($10^{12}$ flops per second) or even petaflops ($10^{15}$ flops per second). Now it
can be appreciated that halving the mesh size will increase the run time by a factor
of $2^3 = 8$ (four times as many cells, and twice as many time steps); to put this into
practical terms, a run which took perhaps some minutes with one mesh may take an
hour or so with a mesh twice as fine.

7 The results in this book were originally prepared largely on an IBM A31 notebook computer in 2003. The
machine had a Pentium® 4, 1.8 GHz, 512 MB RAM. According to this test, the computer produced around
11.7 megaflops, which is quite slow; the clock speed on its own of course says little about floating-point
speed in particular. However, it is quite possible that an implementation in FORTRAN or C would be much faster;
factors of two orders of magnitude are not unusual when converting code which does not readily vectorize in
MATLAB to FORTRAN etc. In the present context, one would expect a less dramatic speed-up, given the vector
nature of the update equations as coded.
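The operation count can be turned into a rough run-time estimate. The sketch below is illustrative only: the grid size and number of time steps are hypothetical, and the 11.7 megaflop rate is simply the figure quoted for the author's notebook computer.

```python
def fdtd2d_flops(nx, ny, steps, flops_per_cell=11):
    """Approximate flop count: 11 flops per cell per time step (5 for Hz, 3 each for Ex, Ey)."""
    return flops_per_cell * nx * ny * steps


def run_time_estimate(total_flops, flop_rate):
    """Run time in seconds for a sustained flop rate given in flops per second."""
    return total_flops / flop_rate


def refinement_cost_factor(refinement, space_dims=2):
    """Cost growth when the cell size shrinks by `refinement` in space and in time."""
    return refinement ** (space_dims + 1)


flops = fdtd2d_flops(nx=200, ny=200, steps=500)      # hypothetical problem size
seconds = run_time_estimate(flops, flop_rate=11.7e6)
print(f"{flops:.3g} flops, about {seconds:.0f} s at 11.7 megaflops")
print("cost factor for halving the mesh size:", refinement_cost_factor(2))
```

Measuring the actual run time and dividing the flop count by it gives the floprate in the same way.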

The analysis just performed leads to a field of computational science known as 
complexity analysis. What is of interest is the asymptotic computational cost of the 
algorithm. For CEM algorithms, this is usually performed on a square region,
dimension $d$ per side, with $\Delta x = \Delta y = \Delta s$ (or in 3D, a cube), thus $N_x = N_y = N$;
furthermore, we note that the number of time-steps $M$ is also essentially
proportional to $N$. Hence the run-time is proportional to $N^3$ for the 2D FDTD algorithm;
alternatively, we describe this as an $O(N^3)$ algorithm.

This analysis in terms of number of unknowns is correct. Since $N$ is inversely
proportional to $\Delta s$, which in turn is often assumed to be inversely proportional
to frequency $f$ (via rules of thumb such as $\Delta s \le \lambda_{\min}/10$), the 2D FDTD
algorithm is also often viewed as $O(f^3)$ or equivalently, noting that $k_{\max} d$ is the
size of the region in wavelengths,⁸ $O((k_{\max} d)^3)$. This, however, is optimistic. The
problem is that the assumption that $\Delta s$ is directly proportional to $\lambda_{\min}$
(and hence $N$ to $k_{\max} d$) is incorrect as the electromagnetic size of the problem
increases. The reason is numerical dispersion in the FDTD grid. As an example, a phase
error of 5% over a region of one wavelength results in around an 18° cumulative
error, probably acceptable; the same percentage error over a region ten wavelengths
in length will produce a cumulative error of 180°, clearly unacceptable.
The dispersion error can be reduced by using a finer mesh. A more realistic
assumption is that $N \propto (k_{\max} d)^{1.5}$; hence the 2D FDTD algorithm has an asymptotic
complexity of $O((k_{\max} d)^{4})$ or $O((k_{\max} d)^{4.5})$, depending on whether the number
of time steps is assumed proportional to $k_{\max} d$ or to $N$. (One will find both in the
literature.)
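The two arguments above, accumulation of phase error and the resulting complexity exponents, can be made concrete with a short sketch. The function names are illustrative; the exponent bookkeeping simply multiplies the two spatial factors of N by the assumed growth of the number of time steps.

```python
def cumulative_phase_error_deg(relative_error, wavelengths):
    """Phase error in degrees accumulated over a path of the given electrical length."""
    return 360.0 * relative_error * wavelengths


def cost_exponent_2d(n_exponent, steps_follow_n):
    """Exponent p in cost ~ (k_max d)^p for 2D FDTD, given N ~ (k_max d)^n_exponent.

    The cost is N * N * M; M grows either like N or like k_max d.
    """
    time_exponent = n_exponent if steps_follow_n else 1.0
    return 2.0 * n_exponent + time_exponent


print(cumulative_phase_error_deg(0.05, 1))    # about 18 degrees
print(cumulative_phase_error_deg(0.05, 10))   # about 180 degrees
print(cost_exponent_2d(1.0, True))            # optimistic estimate: 3
print(cost_exponent_2d(1.5, False))           # 4
print(cost_exponent_2d(1.5, True))            # 4.5
```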

Regarding storage, the 2D FDTD does not make especially heavy demands on
modern computers. The amount of storage required is the following, per cell:


• three field components, times two, for past and present;
• three material constants.


There are $N_x \times N_y$ cells, so the total storage required is $9 N_x \times N_y$ values. In MATLAB,
each real number is stored in double precision, requiring 8 bytes (most conventional
languages, such as C and FORTRAN, permit the user to choose single or
double precision). The storage in bytes is $72 N_x \times N_y$. Of course, there are some
other variables to store as well, but these are generally the largest. The fine mesh
solution of the PEC cylinder discussed in the previous section required around 92 Mbytes
of storage.
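The storage estimate is easily reproduced. The sketch below assumes, as an inference from the positions chosen earlier and not something stated explicitly, that the fine mesh covers a 2 m by 1 m region at a cell size of 0.00125 m, i.e. 1600 by 800 cells.

```python
def fdtd2d_storage_bytes(nx, ny, values_per_cell=9, bytes_per_value=8):
    """Storage for the field and material arrays: 9 double-precision values per cell."""
    return values_per_cell * bytes_per_value * nx * ny


ds = 0.00125                                   # fine-mesh cell size in metres
nx, ny = round(2.0 / ds), round(1.0 / ds)      # assumed 2 m x 1 m region
total = fdtd2d_storage_bytes(nx, ny)
print(f"{nx} x {ny} cells -> {total / 1e6:.1f} Mbytes")
```

Under that assumption the figure comes out at about 92 Mbytes, matching the value quoted above; single precision would halve it.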


8 Since $k_{\max} = 2\pi/\lambda_{\min}$.

3.3 The PML absorbing boundary condition
3.3.1 An historical perspective 


By the early 1990s, the FDTD method had become very popular. However,
terminating the mesh remained problematic. As we have seen, simple
ABCs such as the first-order one already outlined only provide −20 to −30 dB
of absorption, and then only close to normal incidence; whilst there were already
better ones available, they were non-trivial to implement, and battled to provide
more than −50 dB or so. By comparison, good anechoic chambers were able to
provide 70 dB or more of dynamic range. Most of the work on ABCs had concen- 
trated on analytical ABCs, using the properties of the wave operators. However, 
another type of absorber had also been experimented with — perhaps inspired by 
the pyramidal absorbers used in anechoic chambers. This was the use of absorbing 
material at the periphery of the mesh. As we will shortly see, a material with both 
electric and magnetic loss (carefully chosen in the correct ratios) can provide a per- 
fect match, but only at normal incidence. The advantage of this is that the update 
equations do not need to be modified. Early efforts had achieved some success, but 
only worked well near normal incidence. 

In 1994, Berenger published a truly seminal paper⁹ [3]. His idea, like most
really good ones, was in essence quite simple. He noted that the problem with arti- 
ficial absorbers was their inability to operate over a wide range of incidence angles, 
and proposed that the solution was to increase the degrees of freedom available to 
provide the match. He proposed a method to do this in two dimensions, by “splitting”
one of the field components in two (in the case of the $\mathrm{TE}_z$ problem we have
investigated, it is $H_z$ which is thus treated, viz. $H_z = H_{zx} + H_{zy}$) and assigning
different electric and magnetic loss to each component. Despite the initially worri- 
some nature of the split field, he showed that the result was what he called a per- 
fectly matched layer (PML) which, in theory at least, absorbed incident waves of 
all polarizations, at all frequencies, and at all angles of incidence. Furthermore, the 
wave transmitted into the PML had the same wave speed as the incident wave, the 
same characteristic impedance, but attenuated (potentially rapidly) in the normal 
direction. All that was needed to implement the absorber was to modify the FDTD 
update equations in the PML region to accommodate the split field. Perhaps even 
more incredibly, “corner regions” of a mesh, which had long caused problems, 
could be treated by simply overlapping an x-attenuating and a y-attenuating PML. 

This almost appeared too good to be true in 1994; within an extremely short time,
the entire FDTD community identified the crucial importance of Berenger’s work, 


9 In retrospect, ideas in CEM can often be attributed to several independent inventors, but his invention was
unique and certainly deserving of the subsequent accolades. It is interesting that he appears to have published 
nothing on the FDTD in English language journals prior to this, although he had worked on ABCs before, 
publishing in French. 
validated it independently, and quickly extended it to three dimensions. Furthermore,
two different approaches were quickly introduced to avoid the split field
formulation, whilst retaining the superb performance of the PML. The one approach
used “stretched coordinates,” and was independently introduced by Chew
and Weedon [4] and Rappaport [5]; the other used an anisotropic medium with uni- 
axial permittivity and permeability tensors, and was introduced by Sacks et al. [6]; 
the latter approach is generally known as the UPML formulation (Uniaxial PML). 
The stretched coordinate formulation is rather mathematical in nature, but is very 
useful for other coordinate systems; the UPML, due to its physical plausibility 
(usually described as Maxwellian), is probably the most popular contemporary 
approach. Note that even the UPML material is nonetheless fictitious; however, 
Ziolkowski has investigated the physical realizability of such material [7].

In this chapter, we are going to use Berenger’s original split field formulation. 
The reason is that it is both the simplest and also the most efficient approach in two 
dimensions. Using the UPML, for instance, requires introducing the electric and 
magnetic flux vectors, D and B, which doubles the amount of storage required in 
the UPML region, whereas the split field formulation requires only one extra field 
component. Additionally, using the UPML requires that we deal with dispersive 
materials: although this is not too difficult to implement, it is additional complex- 
ity we choose to avoid now. It is important to note that this benefit accrues only 
in two dimensions; in three dimensions, there is little to choose between the for- 
mulations from the viewpoint of efficiency, since all fields must then be split in 
the Berenger approach. Furthermore, dispersive materials with the specific form 
required by the UPML can be quite efficiently handled by the FDTD. Should the 
reader want to undertake a three-dimensional implementation, a detailed discus- 
sion of the UPML approach is available in [8, Chapter 7] and would be the present 
author’s recommendation. 


3.3.2 A numerical absorber — pre-Berenger 


Before discussing Berenger’s contribution, we will review the case of a normally 
matched numerical absorber. Our presentation is based on Gedney and Taflove’s 
approach [1, Chapter 7] and we very largely use their notation here. Firstly, we 
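consider the case of a $\mathrm{TE}_z$ wave
$\mathbf{H}^{\mathrm{inc}} = H_0\, e^{-j(\beta_{1x} x + \beta_{1y} y)}\,\hat{\mathbf{z}}$
incident on a half-space interface with an absorber at $x = 0$. Importantly, the (fictitious) absorber
has both electrical ($\sigma$) and magnetic ($\sigma^*$) loss. The fields on the incident ($x < 0$)
side, region 1, are the usual free-space fields:

$$ \mathbf{H}_1 = \hat{\mathbf{z}}\, H_0 \left(1 + \Gamma e^{j2\beta_{1x}x}\right) e^{-j(\beta_{1x}x + \beta_{1y}y)} $$

$$ \mathbf{E}_1 = \left[ -\hat{\mathbf{x}}\,\frac{\beta_{1y}}{\omega\epsilon_1}\left(1 + \Gamma e^{j2\beta_{1x}x}\right) + \hat{\mathbf{y}}\,\frac{\beta_{1x}}{\omega\epsilon_1}\left(1 - \Gamma e^{j2\beta_{1x}x}\right) \right] H_0\, e^{-j(\beta_{1x}x + \beta_{1y}y)} $$

The fields on the transmitted ($x > 0$) side are:

$$ \mathbf{H}_2 = \hat{\mathbf{z}}\, H_0 \tau\, e^{-j(\beta_{2x}x + \beta_{2y}y)} $$

$$ \mathbf{E}_2 = \left[ -\hat{\mathbf{x}}\,\frac{\beta_{2y}}{\omega\epsilon_2\left(1 + \frac{\sigma}{j\omega\epsilon_2}\right)} + \hat{\mathbf{y}}\,\frac{\beta_{2x}}{\omega\epsilon_2\left(1 + \frac{\sigma}{j\omega\epsilon_2}\right)} \right] H_0 \tau\, e^{-j(\beta_{2x}x + \beta_{2y}y)} $$

Here, $\Gamma$ and $\tau$ are the usual plane-wave reflection and transmission coefficients
at the interface. These equations follow simply from the Maxwell equations, if a
(fictitious) magnetic current and hence loss term are included in Faraday’s law;
they are generalizations of the case discussed by Balanis in [9, Section 5.4.2].
Similarly, the dispersion relationships are:

$$ \beta_{1x} = k_1 \cos\theta_i, \quad \beta_{1y} = k_1 \sin\theta_i, \quad x < 0 $$

$$ \beta_{2x} = \sqrt{(k_2)^2 \left(1 + \frac{\sigma}{j\omega\epsilon_2}\right)\left(1 + \frac{\sigma^*}{j\omega\mu_2}\right) - (\beta_{2y})^2}, \quad x > 0 \tag{3.43} $$

with $k_i = \omega\sqrt{\epsilon_i \mu_i}$, $i = 1, 2$.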


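Enforcing continuity of the tangential fields at the interface, $x = 0$, one obtains:

$$ \Gamma = \frac{\dfrac{\beta_{1x}}{\omega\epsilon_1} - \dfrac{\beta_{2x}}{\omega\epsilon_2(1 + \sigma/j\omega\epsilon_2)}}{\dfrac{\beta_{1x}}{\omega\epsilon_1} + \dfrac{\beta_{2x}}{\omega\epsilon_2(1 + \sigma/j\omega\epsilon_2)}} $$

$$ \tau = 1 + \Gamma $$

$$ \beta_{2y} = \beta_{1y} = k_1 \sin\theta_i $$

For normal incidence ($\theta_i = 0$), this simplifies to:

$$ \Gamma = \frac{\eta_1 - \eta_2}{\eta_1 + \eta_2} $$

with

$$ \eta_1 = \sqrt{\frac{\mu_1}{\epsilon_1}}, \quad \eta_2 = \sqrt{\frac{\mu_2 (1 + \sigma^*/j\omega\mu_2)}{\epsilon_2 (1 + \sigma/j\omega\epsilon_2)}} $$

Now, the core idea: set $\mu_2 = \mu_1$, $\epsilon_2 = \epsilon_1$, and further, enforce

$$ \frac{\sigma^*}{\mu_1} = \frac{\sigma}{\epsilon_1} \;\Rightarrow\; \sigma^* = \sigma \mu_1/\epsilon_1 = \sigma (\eta_1)^2 \tag{3.44} $$

Then, $k_1 = k_2$, $\eta_1 = \eta_2$, and thus we obtain perfect absorption: $\Gamma = 0$. Also, very
importantly,

$$ \beta_{2x} = \left(1 + \frac{\sigma}{j\omega\epsilon_1}\right) k_1 = k_1 - j\sigma\eta_1 \tag{3.45} $$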



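and the transmitted fields in region 2 are

$$ \mathbf{E}_2 = \hat{\mathbf{y}}\, \eta_1 H_0\, e^{-jk_1 x}\, e^{-\sigma\eta_1 x} $$

$$ \mathbf{H}_2 = \hat{\mathbf{z}}\, H_0\, e^{-jk_1 x}\, e^{-\sigma\eta_1 x} $$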


In summary, note the following important features of this solution. 


• At normal incidence, there is no reflection at the interface: hence (at this angle at least)
we have a perfectly matched layer (PML).

• The transmitted wave in the absorber has the same velocity as in region 1, but attenuates
in the normal direction.

• Although lossy, the absorbing material is dispersionless (that is, the wave speed is
independent of frequency).
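The behavior summarized above can be explored numerically from the dispersion relation (3.43) and the reflection coefficient that follows from continuity of the tangential fields. The sketch below is an illustration under assumptions: free-space material constants on both sides, an arbitrary frequency of 1 GHz and conductivity of 0.1 S/m, and a hypothetical function name. The principal complex square root is used, which here gives the decaying branch.

```python
import cmath
import math

EPS0 = 8.8541878128e-12        # permittivity of free space, F/m
MU0 = 4.0e-7 * math.pi         # permeability of free space, H/m


def gamma_matched_absorber(theta_deg, freq, sigma, eps=EPS0, mu=MU0):
    """H-field reflection coefficient for an absorber with sigma* = sigma * mu / eps."""
    w = 2.0 * math.pi * freq
    theta = math.radians(theta_deg)
    k1 = w * math.sqrt(mu * eps)
    sigma_star = sigma * mu / eps                    # matching condition, Eq. (3.44)
    s_e = 1.0 + sigma / (1j * w * eps)               # electric loss factor
    s_m = 1.0 + sigma_star / (1j * w * mu)           # magnetic loss factor
    b1x = k1 * math.cos(theta)
    b2y = k1 * math.sin(theta)                       # phase matching along the interface
    b2x = cmath.sqrt(k1 ** 2 * s_e * s_m - b2y ** 2)  # Eq. (3.43)
    z1 = b1x / (w * eps)
    z2 = b2x / (w * eps * s_e)
    return (z1 - z2) / (z1 + z2)


for angle in (0.0, 15.0, 30.0, 45.0, 60.0):
    g = abs(gamma_matched_absorber(angle, freq=1.0e9, sigma=0.1))
    print(f"theta = {angle:4.1f} deg  |Gamma| = {g:.3e}")
```

The reflection vanishes (to rounding error) at normal incidence and grows as the angle of incidence increases, which is the limitation the split field formulation addresses.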


3.3.3 Berenger’s split field PML formulation 


The previous fictitious absorber exhibits PML behavior only at normal incidence; 
its properties degrade rapidly off-normal. As discussed in the introductory com- 
ments, Berenger recognized that an additional degree of freedom would permit a 
match off-normal as well. He did this by “splitting” the transverse fields into two
orthogonal components, for example $H_z = H_{zx} + H_{zy}$ in his notation. Associated
with these were two components¹⁰ of $\sigma^*$ ($\sigma_x^*$ and $\sigma_y^*$) and similarly, two
components of $\sigma$ ($\sigma_x$ and $\sigma_y$).

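Applying this to our previous two-dimensional TE problem, instead of the usual
three equations in $E_x$, $E_y$ and $H_z$ (for example, as in Eqs. (3.5)–(3.7)) we now
have four:

$$ j\omega\epsilon_2 \left(1 + \frac{\sigma_y}{j\omega\epsilon_2}\right) E_x = \frac{\partial (H_{zx} + H_{zy})}{\partial y} \tag{3.46} $$

$$ j\omega\epsilon_2 \left(1 + \frac{\sigma_x}{j\omega\epsilon_2}\right) E_y = -\frac{\partial (H_{zx} + H_{zy})}{\partial x} \tag{3.47} $$

$$ j\omega\mu_2 \left(1 + \frac{\sigma_x^*}{j\omega\mu_2}\right) H_{zx} = -\frac{\partial E_y}{\partial x} \tag{3.48} $$

$$ j\omega\mu_2 \left(1 + \frac{\sigma_y^*}{j\omega\mu_2}\right) H_{zy} = \frac{\partial E_x}{\partial y} \tag{3.49} $$

Introducing the variables

$$ s_k = 1 + \frac{\sigma_k}{j\omega\epsilon_2}, \quad s_k^* = 1 + \frac{\sigma_k^*}{j\omega\mu_2}, \quad k = x, y \tag{3.50} $$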


10 In retrospect, this was the crucial idea, and the split field simply a mathematical device to accomplish this:
clearly this defines an anisotropic medium of some type. 
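it may be shown that:

$$ H_z = H_0 \tau\, e^{-j\sqrt{s_x s_x^*}\,\beta_{2x} x - j\sqrt{s_y s_y^*}\,\beta_{2y} y} \tag{3.51} $$

$$ E_x = -H_0 \tau\, \frac{\beta_{2y}}{\omega\epsilon_2} \sqrt{\frac{s_y^*}{s_y}}\, e^{-j\sqrt{s_x s_x^*}\,\beta_{2x} x - j\sqrt{s_y s_y^*}\,\beta_{2y} y} \tag{3.52} $$

$$ E_y = H_0 \tau\, \frac{\beta_{2x}}{\omega\epsilon_2} \sqrt{\frac{s_x^*}{s_x}}\, e^{-j\sqrt{s_x s_x^*}\,\beta_{2x} x - j\sqrt{s_y s_y^*}\,\beta_{2y} y} \tag{3.53} $$

with

$$ (\beta_{2x})^2 + (\beta_{2y})^2 = (k_2)^2 \tag{3.54} $$

Clearly, these can be discretized using the central-differenced leapfrog Yee approach.

The phase-matching condition at the interface requires that the propagation constants
in the y-direction are identical; this can be achieved if $s_y s_y^* = 1$, or equivalently
$\sigma_y = \sigma_y^* = 0$. Thus, $\beta_{2y} = \beta_{1y} = k_1 \sin\theta_i$. Further, the H-field reflection
coefficient may be shown to be:

$$ \Gamma = \frac{\dfrac{\beta_{1x}}{\omega\epsilon_1} - \dfrac{\beta_{2x}}{\omega\epsilon_2}\sqrt{\dfrac{s_x^*}{s_x}}}{\dfrac{\beta_{1x}}{\omega\epsilon_1} + \dfrac{\beta_{2x}}{\omega\epsilon_2}\sqrt{\dfrac{s_x^*}{s_x}}} \tag{3.55} $$

Now, let $\epsilon_1 = \epsilon_2$, $\mu_1 = \mu_2$, and $s_x = s_x^*$. This is equivalent to $k_1 = k_2$,
$\eta_1 = \sqrt{\mu_1/\epsilon_1} = \sqrt{\mu_2/\epsilon_2} = \eta_2$ and $\sigma_x/\epsilon_1 = \sigma_x^*/\mu_1$. Thus, from Eq. (3.54),
$\beta_{1x} = \beta_{2x}$, and from Eq. (3.55), $\Gamma = 0$. The resultant $\mathrm{TE}_z$ field transmitted into the PML is
then:

$$ H_z = H_0\, e^{-j\beta_{1x} x - j\beta_{1y} y}\, e^{-\sigma_x \eta_1 \cos\theta_i\, x} \tag{3.56} $$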

and similar expressions for Ey and E,. 

These have the same behavior as the previous normal-only PML, but attenuate 
without dispersion for all incident angles. 
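A short numerical check of these statements is sketched below, using the reflection coefficient (3.55) and the attenuation factor in (3.56). It assumes free-space constants on both sides of the interface, so that Eq. (3.54) with phase matching gives the same x-directed propagation constant in both regions; the frequency, conductivity and function names are illustrative choices.

```python
import cmath
import math

EPS0 = 8.8541878128e-12        # permittivity of free space, F/m
MU0 = 4.0e-7 * math.pi         # permeability of free space, H/m


def gamma_split_pml(theta_deg, freq, sigma_x, sigma_x_star, eps=EPS0, mu=MU0):
    """Reflection coefficient of Eq. (3.55) with eps1 = eps2 and mu1 = mu2."""
    w = 2.0 * math.pi * freq
    k1 = w * math.sqrt(mu * eps)
    b1x = k1 * math.cos(math.radians(theta_deg))
    b2x = b1x                                    # from Eq. (3.54) with k2 = k1, b2y = b1y
    s_x = 1.0 + sigma_x / (1j * w * eps)
    s_x_star = 1.0 + sigma_x_star / (1j * w * mu)
    z1 = b1x / (w * eps)
    z2 = b2x / (w * eps) * cmath.sqrt(s_x_star / s_x)
    return (z1 - z2) / (z1 + z2)


def pml_attenuation(theta_deg, sigma_x, depth, eps=EPS0, mu=MU0):
    """Amplitude factor exp(-sigma_x * eta1 * cos(theta) * x) from Eq. (3.56)."""
    eta1 = math.sqrt(mu / eps)
    return math.exp(-sigma_x * eta1 * math.cos(math.radians(theta_deg)) * depth)


sigma_x = 0.1
matched = sigma_x * MU0 / EPS0                   # sigma_x* chosen so that s_x = s_x*
for angle in (0.0, 30.0, 60.0, 80.0):
    g_ok = abs(gamma_split_pml(angle, 1.0e9, sigma_x, matched))
    g_bad = abs(gamma_split_pml(angle, 1.0e9, sigma_x, 0.0))
    att = pml_attenuation(angle, sigma_x, depth=0.0125)
    print(f"{angle:4.1f} deg  matched {g_ok:.1e}  unmatched {g_bad:.3f}  attenuation {att:.3f}")
```

With the matching condition the reflection is zero to rounding error at every angle, while the attenuation per unit depth weakens towards grazing incidence because of the cosine factor.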

These results are so important that we will highlight them again in summary 
form. 


• Theoretically, the PML absorbs incident waves of all polarizations, at all frequencies,
and at all angles of incidence.

• Further, the wave transmitted into the PML has the same wave speed as the incident
wave, the same characteristic impedance, but attenuates (potentially rapidly) in the
normal direction.

• All that is needed to implement the absorber is to modify the FDTD update equations in
the PML region. (Again, in retrospect what is required is the ability to handle a certain
type of lossy anisotropic material; this at heart is why the update equations need to be
modified.)

• Although perhaps not immediately clear from the above, a “corner region” of a mesh
can be treated by simply overlapping an x-attenuating and a y-attenuating PML. This
had long been a very troublesome problem with analytical ABCs.


We have already discussed the alacrity with which Berenger’s idea was adopted 
in the FDTD community; within a few months, the PML had been extended to 
three dimensions by Katz, Thiele and Taflove [10]; Berenger himself also extended 
his formulation to three dimensions [11]. 


3.3.4 The FDTD update equations for a PML 


With the theoretical background in place, we turn our attention to implementing 
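and then testing a split field PML. The time domain equivalents of Eqs. (3.46)–(3.49) are

$$ \left(\epsilon_2 \frac{\partial}{\partial t} + \sigma_y\right) E_x = \frac{\partial (H_{zx} + H_{zy})}{\partial y} \tag{3.57} $$

$$ \left(\epsilon_2 \frac{\partial}{\partial t} + \sigma_x\right) E_y = -\frac{\partial (H_{zx} + H_{zy})}{\partial x} \tag{3.58} $$

$$ \left(\mu_2 \frac{\partial}{\partial t} + \sigma_x^*\right) H_{zx} = -\frac{\partial E_y}{\partial x} \tag{3.59} $$

$$ \left(\mu_2 \frac{\partial}{\partial t} + \sigma_y^*\right) H_{zy} = \frac{\partial E_x}{\partial y} \tag{3.60} $$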


Compared to Eqs. (3.5)–(3.7), the loss terms bring a slight complication: we
require the value of the electric field, for instance, at a half time-step, e.g.
$E_x(i + \frac{1}{2}, j, n + \frac{1}{2})$, a point at which it is not available. (Note that this
problem is due to the presence of loss, and not specifically because of the PML;
even a normal material with finite electrical conductivity presents this problem.) A
method widely used with success is the “semi-implicit”¹¹ approximation: the
required value is computed as the arithmetic average of the previous (known) value
and the as-yet-to-be-computed value, i.e.

$$ E_x(i + \tfrac{1}{2}, j, n + \tfrac{1}{2}) = \frac{E_x(i + \tfrac{1}{2}, j, n + 1) + E_x(i + \tfrac{1}{2}, j, n)}{2} \tag{3.61} $$


11 The FDTD method is an explicit method; “future” values are computed entirely from “present” and “past”
ones. The approach discussed here uses a “future” value as unknown in the update equation, albeit itself, and 
hence the name “semi-implicit.” 
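Using this approximation, and otherwise proceeding as before, the result is the
following set of update equations:

$$ H_{zx}(i, j, n) = D_{aH_{zx}}(i, j)\, H_{zx}(i, j, n - 1) - D_{bH_{zx}}(i, j)\, [E_y(i + 1, j, n) - E_y(i, j, n)] \tag{3.62} $$

$$ H_{zy}(i, j, n) = D_{aH_{zy}}(i, j)\, H_{zy}(i, j, n - 1) + D_{bH_{zy}}(i, j)\, [E_x(i, j + 1, n) - E_x(i, j, n)] \tag{3.63} $$

$$ E_x(i, j, n + 1) = C_{aE_x}(i, j)\, E_x(i, j, n) + C_{bE_x}(i, j)\, [H_z(i, j, n + 1) - H_z(i, j - 1, n + 1)] \tag{3.64} $$

$$ E_y(i, j, n + 1) = C_{aE_y}(i, j)\, E_y(i, j, n) - C_{bE_y}(i, j)\, [H_z(i, j, n + 1) - H_z(i - 1, j, n + 1)] \tag{3.65} $$

where we have combined¹² the H field in Eqs. (3.64) and (3.65):

$$ H_z(i, j, n) = H_{zx}(i, j, n) + H_{zy}(i, j, n) \tag{3.66} $$

and the material constants are defined as

$$ C_{aE_x}(i, j) = \frac{1 - \dfrac{\sigma_y(i, j)\Delta t}{2\epsilon_2(i, j)}}{1 + \dfrac{\sigma_y(i, j)\Delta t}{2\epsilon_2(i, j)}} \tag{3.67} $$

$$ C_{bE_x}(i, j) = \frac{\dfrac{\Delta t}{\epsilon_2(i, j)\Delta y}}{1 + \dfrac{\sigma_y(i, j)\Delta t}{2\epsilon_2(i, j)}} \tag{3.68} $$

$$ C_{aE_y}(i, j) = \frac{1 - \dfrac{\sigma_x(i, j)\Delta t}{2\epsilon_2(i, j)}}{1 + \dfrac{\sigma_x(i, j)\Delta t}{2\epsilon_2(i, j)}} \tag{3.69} $$

$$ C_{bE_y}(i, j) = \frac{\dfrac{\Delta t}{\epsilon_2(i, j)\Delta x}}{1 + \dfrac{\sigma_x(i, j)\Delta t}{2\epsilon_2(i, j)}} \tag{3.70} $$

$$ D_{aH_{zx}}(i, j) = \frac{1 - \dfrac{\sigma_x^*(i, j)\Delta t}{2\mu_2(i, j)}}{1 + \dfrac{\sigma_x^*(i, j)\Delta t}{2\mu_2(i, j)}} \tag{3.71} $$

$$ D_{bH_{zx}}(i, j) = \frac{\dfrac{\Delta t}{\mu_2(i, j)\Delta x}}{1 + \dfrac{\sigma_x^*(i, j)\Delta t}{2\mu_2(i, j)}} \tag{3.72} $$

$$ D_{aH_{zy}}(i, j) = \frac{1 - \dfrac{\sigma_y^*(i, j)\Delta t}{2\mu_2(i, j)}}{1 + \dfrac{\sigma_y^*(i, j)\Delta t}{2\mu_2(i, j)}} \tag{3.73} $$

$$ D_{bH_{zy}}(i, j) = \frac{\dfrac{\Delta t}{\mu_2(i, j)\Delta y}}{1 + \dfrac{\sigma_y^*(i, j)\Delta t}{2\mu_2(i, j)}} \tag{3.74} $$

12 This is slightly more convenient to code. However, note that the split fields must be retained, and updated as
usual before the next iteration.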
As usual with an FDTD equation set, there are subtle differences between the 
otherwise very repetitive equations which one must be careful to code correctly. In 
particular, note that $\sigma_y$ is associated with the $E_x$ update (and vice versa), whereas
$\sigma_x^*$ and $\sigma_y^*$ are associated with the $H_{zx}$ and $H_{zy}$ updates respectively.
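All eight material constants share one algebraic form, so a single helper can generate them; what differs is only which conductivity, material constant and cell dimension are passed in. The sketch below is an illustration with hypothetical names and arbitrary parameter values, not the book's code. It also checks that the semi-implicit decay factor closely tracks the exact exponential decay of a source-free lossy medium when the loss per time step is small.

```python
import math

EPS0 = 8.8541878128e-12        # permittivity of free space, F/m
MU0 = 4.0e-7 * math.pi         # permeability of free space, H/m


def lossy_update_coeffs(sigma, material, dt, delta):
    """Return the (a, b) pair of a semi-implicit lossy update.

    a = (1 - sigma*dt/(2*material)) / (1 + sigma*dt/(2*material))
    b = (dt / (material*delta))     / (1 + sigma*dt/(2*material))
    """
    half_loss = sigma * dt / (2.0 * material)
    a = (1.0 - half_loss) / (1.0 + half_loss)
    b = (dt / (material * delta)) / (1.0 + half_loss)
    return a, b


dt, dx, dy = 5.0e-12, 0.0025, 0.0025         # illustrative step sizes
sigma_x, sigma_y = 0.01, 0.02                # illustrative electric conductivities
sigma_x_star = sigma_x * MU0 / EPS0          # matched magnetic conductivities
sigma_y_star = sigma_y * MU0 / EPS0

# Note the pairing: sigma_y with Ex (and dy), sigma_x with Ey (and dx).
ca_ex, cb_ex = lossy_update_coeffs(sigma_y, EPS0, dt, dy)
ca_ey, cb_ey = lossy_update_coeffs(sigma_x, EPS0, dt, dx)
da_hzx, db_hzx = lossy_update_coeffs(sigma_x_star, MU0, dt, dx)
da_hzy, db_hzy = lossy_update_coeffs(sigma_y_star, MU0, dt, dy)

exact_decay = math.exp(-sigma_x * dt / EPS0)
print(f"Ca_Ey = {ca_ey:.9f}, exact exponential decay = {exact_decay:.9f}")
```

Setting every conductivity to zero returns a = 1 and b = dt / (material * delta), i.e. the ordinary lossless update constants.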


3.3.5 PML implementation issues 


One issue which one needs to decide upon when implementing an FDTD PML 
code is whether the PML update equations are going to be used throughout the 
entire computational domain, or whether different code will be written for each 
section. (By simply setting the conductivities to zero, the PML reduces to the 
usual update equations; alternatively, the electrical conductivity may be retained 
if required, etc.) The former has the advantage of being far simpler — and cor- 
ner regions are very simply catered for automatically — but it does increase the 
memory requirement. The latter is far more tedious to code and the potential for 
coding error is much higher, but it is more memory efficient. In 2D, the overhead 
is only 33% in the non-PML regions, and since 2D FDTD codes are in any case 
not especially memory intensive, it is almost certainly better to use the PML up- 
date equations throughout. In 3D, however, the overhead is 100% in the non-PML 
regions, and the decision is not quite so straightforward. Bear in mind though that 
the PML works so well that the absorbing boundary can be brought quite close to 
the scatterer, reducing the memory required in any case. 

Remember also that the exact positions of the material parameters are implied
but not explicitly stated in Eqs. (3.67)–(3.74); for example, in Eqs. (3.67)
and (3.68), $\sigma_y(i, j)$ must be evaluated at $([i - \frac{1}{2}]\Delta x, [j - 1]\Delta y)$, the position
of the relevant $E_x$ field component; similarly, in Eqs. (3.69) and (3.70), $\sigma_x(i, j)$
must be evaluated at $([i - 1]\Delta x, [j - \frac{1}{2}]\Delta y)$; and in Eqs. (3.71)–(3.74), $\sigma_x^*(i, j)$
and $\sigma_y^*(i, j)$ must be evaluated at $([i - \frac{1}{2}]\Delta x, [j - \frac{1}{2}]\Delta y)$. (Note that $H_{zx}$ and
$H_{zy}$ are located at the same grid point, the usual $H_z$ location.) This implies of
course that $\sigma_x$, $\sigma_y$, and the pair $\sigma_x^*$, $\sigma_y^*$ are always evaluated a half-grid point
apart. Since the usual polynomial scaling results in quite rapidly changing
conductivities, this is an important point to bear in mind for a high-performance
PML.
Theoretically, the PML can be made as thin as desired by simply making the
material extremely lossy. In practice, the FDTD discretization, with the accompanying
half-cell offset, produces some “numerical” reflection. To ameliorate this,
practical PML schemes use a number of FDTD cells to implement the absorber,
with a “graded” loss profile, increasing from zero loss at the PML/free space
interface to some maximum value at the boundary of the grid. A widely used
profile is polynomial grading; for a PML of thickness $d$, the value of $\sigma_x$ at depth
$x$ is

$$ \sigma_x = \left(\frac{x}{d}\right)^m \sigma_{x,\max} \tag{3.75} $$

where $m$ is the polynomial order and $\sigma_{x,\max}$ is the maximum value attained at $x = d$.
Typical practical PMLs are five to ten FDTD cells thick, with a polynomial order loss
profile from two to four.

When discretized in an FDTD mesh, the discretization error produces a filtering 
effect, which produces some frequency dependence — typically low frequencies are 
not absorbed as well as higher frequencies. 

Thus far, nothing has been said about suitable values for $\sigma$. An extensive series
of numerical experiments has demonstrated that an optimal choice of this parameter
for polynomial grading is

$$ \sigma_{x,\max} \approx \frac{0.8\,(m + 1)}{\eta \Delta s} \tag{3.76} $$

Usually, the external walls are treated as PECs for simplicity, i.e. the relevant 
tangential field is set to zero. 

When implementing a PML, one needs to think carefully about the slight lack
of symmetry in FDTD grids. As an example, consider $\sigma_y$ in the layer of cells
with, on the one hand, $j = 1$ and on the other, $j = N_y$. Setting the tangential fields
($E_x$) to zero, the result is that in the layer of cells with $j = N_y$, there is no field,
since the relevant $E_x$ field component is “below” the last cell (in the geometry of
Fig. 3.1). Thus, the value of $\sigma_y$ in cell layer $j = 1$ actually corresponds to that in
cell layer $j = N_y - 1$, rather than $j = N_y$. Also, once $\sigma_y$ has been computed, it is
tempting to find $\sigma_y^*$ using $\sigma_y^* = \eta^2 \sigma_y$, but as we have already commented above,
this is subtly incorrect due to the $\Delta s/2$ offset between electric and magnetic field
points.
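The grading and the half-cell offset can be illustrated with a short sketch. Several things here are assumptions and should be read as such: the rule 0.8 (m + 1) / (eta * ds) for the maximum conductivity is the commonly quoted form, the electric nodes are placed at whole-cell depths and the magnetic nodes half a cell deeper (which node leads depends on the wall being treated), and all names are hypothetical.

```python
ETA0 = 376.730313668           # free-space wave impedance in ohms


def sigma_max_rule(order, ds, eta=ETA0):
    """Assumed rule of thumb for the maximum conductivity: 0.8 (m + 1) / (eta * ds)."""
    return 0.8 * (order + 1) / (eta * ds)


def graded_sigma(depth, thickness, sigma_max, order):
    """Polynomial grading: zero at the interface, sigma_max at depth = thickness."""
    if depth <= 0.0:
        return 0.0
    return sigma_max * (min(depth, thickness) / thickness) ** order


def pml_profiles(n_cells, ds, order=3, eta=ETA0):
    """Sample sigma at electric nodes and sigma* at magnetic nodes half a cell deeper."""
    thickness = n_cells * ds
    s_max = sigma_max_rule(order, ds, eta)
    sigma_e = [graded_sigma(p * ds, thickness, s_max, order) for p in range(n_cells)]
    sigma_m = [eta ** 2 * graded_sigma((p + 0.5) * ds, thickness, s_max, order)
               for p in range(n_cells)]
    return sigma_e, sigma_m


sigma_e, sigma_m = pml_profiles(n_cells=10, ds=0.0025)
for p, (se, sm) in enumerate(zip(sigma_e, sigma_m)):
    # sm / eta^2 differs from se: each profile is evaluated at its own node position.
    print(f"cell {p:2d}: sigma = {se:.4e}   sigma*/eta^2 = {sm / ETA0 ** 2:.4e}")
```

The printed columns differ in every cell, which is exactly the error that would be introduced by computing the magnetic conductivity as eta squared times the electric conductivity sampled at the electric node.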


Coding hints — testing a PML 


The first test to run with a PML is a free-space test: set all the conductivities 
to zero, which effectively reduces the PML to free space. Errors in the update 
equations will often quickly make themselves apparent without having to worry 
about whether conductivity profiles have been set correctly, for instance. 
Figure 3.16 Normalized reflection from a split field PML.


3.3.6 Results for a split field PML 


The PML performs so well as an absorber that trying to identify the reflection 
visually, as we did with the first-order ABC earlier, is impossible. The correct 
approach to testing a PML (or indeed any ABC) is to run two simulations, with 
identical discretization and source: one with the ABC under test, and another with 
a rather larger computational space. The signal is then compared at a point near 
the ABC. In this case, a 200 × 200 simulation was compared with a 400 × 400
simulation. The two signals cannot be distinguished on a graph, so on Fig. 3.16,
the difference between the signals is shown; this is the reflection. Note the vertical
scale. This has also been normalized by the signal peak, and further time-gated to
remove double reflections, etc. When expressed in dBs in Fig. 3.17, the results
are deeply impressive: the five cell thick, third-order polynomial grading PML has
a maximum reflection of around −65 dB; the ten cell thick PML improves this
to −85 dB.
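The comparison procedure described above reduces to a few lines once the two time records are available. The sketch below uses made-up sample values and a hypothetical function name; it assumes the records are field amplitudes sampled at the same point and time steps, so the dB conversion uses 20 log10.

```python
import math


def normalized_reflection_db(test_signal, reference_signal, floor_db=-300.0):
    """Difference between the two runs, normalized by the reference peak, in dB."""
    peak = max(abs(v) for v in reference_signal)
    result = []
    for test, ref in zip(test_signal, reference_signal):
        error = abs(test - ref) / peak
        # Identical samples would give log10(0); report a floor value instead.
        result.append(20.0 * math.log10(error) if error > 0.0 else floor_db)
    return result


reference = [0.0, 1.0, 0.5, -0.25]        # large-domain run (made-up values)
with_abc = [0.0, 1.001, 0.5, -0.2499]     # run with the ABC under test (made-up values)
reflection = normalized_reflection_db(with_abc, reference)
print([round(v, 1) for v in reflection])
print("worst case:", round(max(reflection), 1), "dB")
```

Time-gating, as mentioned above, would be applied to both records before the comparison so that only the first reflection from the boundary under test is assessed.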

Prior to the Berenger PML, the best ABCs were challenged to produce reflection
coefficients significantly less than around −50 dB. As we have seen, the Berenger
PML offers astounding performance: broadband reflection coefficients far less
than this are easily achieved, and with care (for example, optimized conductivity
profile, double precision), reflections of the order of −100 dB and significantly
less have been obtained. The FDTD is in a position to out-perform very careful
measurements; as mentioned earlier, sophisticated anechoic chambers have dynamic
ranges of around 70 dB but there is little prospect of dramatic improvements
there.

Figure 3.17 Normalized reflection from a split field PML, in dB.


3.3.7 Drawbacks of the Berenger PML 


It may seem curmudgeonly to offer any criticism at all of such an innovation, 
but despite its superb performance, the PML has some drawbacks, especially in 
three dimensions. For 3D formulations, the PML requires that all field compo- 
nents be split, doubling the memory requirements in the absorbing region; with a 
5-10 cell thick layer in 3D this can become a significant overhead. (Other for- 
mulations, such as the UPML, do not split the field but instead require the D 
and B fields to be stored as well in each cell, so the overhead is the same.) The 
Berenger PML is non-Maxwellian; the field splitting is a mathematical artifact 
which works very well but leaves niggling questions about physical reality. These 
drawbacks led to the investigation of other equivalent formulations, aiming to
reproduce the superb performance of the PML with a (potentially) physically
realizable material. This is also important for applications in FEM codes, where the
split field formalism has no counterpart. Two approaches have emerged: the uniaxial
anisotropic absorber and the stretched coordinate formulations. Although our
implementation is the original split field one, we will briefly outline these other 
approaches. 
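

3.3.8 Uniaxial absorber theory


A uniaxial material has the following tensor characterization:

$$
\bar{\bar{\epsilon}} = \epsilon_1 \begin{bmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & b \end{bmatrix}, \qquad
\bar{\bar{\mu}} = \mu_1 \begin{bmatrix} c & 0 & 0 \\ 0 & d & 0 \\ 0 & 0 & d \end{bmatrix} \qquad (3.77)
$$

with $\vec{D} = \bar{\bar{\epsilon}}\vec{E}$ and $\vec{B} = \bar{\bar{\mu}}\vec{H}$. It has been shown
that if the tensors are chosen as follows:

$$
\bar{\bar{\epsilon}} = \epsilon_1 \bar{\bar{s}}, \qquad
\bar{\bar{\mu}} = \mu_1 \bar{\bar{s}}, \qquad
\bar{\bar{s}} = \begin{bmatrix} s_x^{-1} & 0 & 0 \\ 0 & s_x & 0 \\ 0 & 0 & s_x \end{bmatrix}
$$

then a plane wave is completely transmitted (i.e. $\Gamma = 0$), independent of angle,
frequency and polarization: a uniaxial PML (UPML).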
The identity with Berenger’s PML is reinforced with the choice:

$$
s_x = 1 + \frac{\sigma_x}{j\omega\epsilon} \qquad (3.78)
$$

Note that this material is dispersive. This UPML and Berenger’s split field PML
have been shown to have the same propagation characteristics. The associated
Gauss’ laws are different (but irrelevant in an FDTD code, which discretizes only
Ampère’s and Faraday’s laws).

The UPML can be discretized relatively simply in an FDTD fashion; the best 
source here is [1, Chapter 7]. However, instead of split fields, the D and B vectors 
must also be stored and updated in the PML region. As mentioned, the material 
is dispersive; fortunately, there are some elegant approaches available to deal with 
this [1]. 
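
The dispersive nature of the UPML material is easy to see numerically. The sketch
below evaluates the factor of Eq. (3.78) and the diagonal entries of the tensor s
at two frequencies. It is a minimal illustration only: the free-space permittivity
is assumed for the permittivity in Eq. (3.78), the conductivity value is arbitrary,
and the function names are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12  # free-space permittivity in F/m (assumed for Eq. 3.78)


def stretch_factor(sigma_x, freq_hz, eps=EPS0):
    """s_x = 1 + sigma_x / (j*omega*eps), following Eq. (3.78)."""
    omega = 2.0 * math.pi * freq_hz
    return 1.0 + sigma_x / (1j * omega * eps)


def upml_diagonal(sigma_x, freq_hz, eps=EPS0):
    """Diagonal entries (1/s_x, s_x, s_x) of s for a layer whose normal is x."""
    s_x = stretch_factor(sigma_x, freq_hz, eps)
    return (1.0 / s_x, s_x, s_x)


# Arbitrary illustrative conductivity of 0.5 S/m at 1 GHz and 10 GHz.
for freq_hz in (1e9, 10e9):
    diagonal = upml_diagonal(0.5, freq_hz)
    print(f"{freq_hz / 1e9:5.1f} GHz: s_x = {diagonal[1]:.4f}, 1/s_x = {diagonal[0]:.4f}")
```

The real part of s_x stays at unity while the imaginary part scales as 1/frequency,
so the effective material parameters change with frequency. With zero
conductivity the tensor reduces to the identity, i.e. the ordinary medium.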


3.3.9 Stretched coordinate theory 


Another formulation shown to be equivalent is the “stretched coordinate” theory.
The Cartesian coordinates (x, y, z) are mapped into complex space using

$$
\tilde{x} = \int_0^x s_x(x')\,dx' \qquad (3.79)
$$

and similarly for y and z. Partial derivatives then become:

$$
\frac{\partial}{\partial \tilde{x}} = \frac{1}{s_x}\frac{\partial}{\partial x} \qquad (3.80)
$$

and these are carried into the Maxwell equations. Stretched coordinates have been
useful in extending the PML to cylindrical and spherical coordinate systems.
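
As a small numerical illustration of the mapping in Eq. (3.79), the sketch below
integrates the stretching factor through a layer with a polynomial conductivity
grading and compares the result with the closed-form integral. The grading law,
the parameter values and the use of the free-space permittivity are assumptions
made for this example only.

```python
import math

EPS0 = 8.8541878128e-12  # free-space permittivity in F/m (assumed)


def sigma_profile(x, thickness, sigma_max, order=3):
    """Assumed polynomial grading: sigma(x) = sigma_max * (x / thickness)**order."""
    return sigma_max * (x / thickness) ** order


def stretched_coordinate(x, thickness, sigma_max, freq_hz, order=3, steps=2000):
    """Midpoint-rule approximation of Eq. (3.79), with s_x = 1 + sigma/(j*omega*eps0)."""
    omega = 2.0 * math.pi * freq_hz
    h = x / steps
    total = 0.0 + 0.0j
    for n in range(steps):
        x_mid = (n + 0.5) * h
        s_x = 1.0 + sigma_profile(x_mid, thickness, sigma_max, order) / (1j * omega * EPS0)
        total += s_x * h
    return total


def stretched_coordinate_exact(x, thickness, sigma_max, freq_hz, order=3):
    """Closed-form integral of the same s_x for the polynomial profile."""
    omega = 2.0 * math.pi * freq_hz
    graded = sigma_max * x ** (order + 1) / ((order + 1) * thickness ** order)
    return x + graded / (1j * omega * EPS0)


# A 10 mm layer, 10 S/m peak conductivity, 10 GHz (illustrative values only).
numeric = stretched_coordinate(0.01, 0.01, 10.0, 10e9)
exact = stretched_coordinate_exact(0.01, 0.01, 10.0, 10e9)
print("numerical:", numeric)
print("exact:    ", exact)
```

The real part of the stretched coordinate is simply x; the absorbing behavior is
carried entirely by the (negative) imaginary part, which grows through the layer.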


3.3.10 Further reading on PMLs


An excellent description by Gedney and Taflove may be found in [1, Chapter 7]. (A
very similar treatment is also available in the slightly earlier [12, Chapter 5].) The
treatment presented here is based on this approach. Gedney and Taflove summarize
well the fervor with which the FDTD community adopted, expanded and generalized
Berenger’s work, and provide an extremely useful unified view of the original
split field formulation, the UPML and the stretched coordinate viewpoints, with
a consistent notation. The original paper by Berenger remains interesting reading
[3]. There are a very large number of papers on the subject of PML and the FDTD;
the interested reader is referred to the extensive list of references in [1, Chapter 7].
One paper which is worth highlighting is Wittwer and Ziolkowski’s contribution
[13], since this discusses a number of practical issues in PML implementation.


3.3.11 Conclusions on the PML 


Berenger’s PML (and the related UPML) came close to putting the ABC “indus- 
try” out of business, at least in the FDTD community. Using the Berenger PML, 
a numerical absorber for the FDTD with essentially arbitrarily good performance 
can be produced. This has been extended to terminating conductive and/or disper- 
sive regions, as well as half-spaces [1, Chapter 7]. There are still some detail issues 
to consider — although the basic formulation has been done, details for the PML in 
other coordinate systems are not always readily available. 

The PML has some computational overhead and does complicate a code to some 
extent, whether one uses the split field, UPML or stretched coordinate formula- 
tions. 

It should be commented that such superb absorption is not always required, and 
a simple ABC is sometimes sufficient, especially if combined with time-gating. 

A final comment: the issue of high-performance numerical absorbers in FEM 
codes is not such a closed topic; UPML in an FEM mesh can wreck matrix condi- 
tioning and radically slow iterative solvers to the point of uselessness. With time 
domain FEM, the dispersive nature of the UPML is especially problematic. 


3.4 The 3D FDTD algorithm 


Extending the two-dimensional algorithm to three dimensions is straightforward 
from the viewpoint of the update equations. However, there is no analogy to the 
TM and TE modes, and all six field components must be updated. The field com- 
ponents are located on the full Yee cell. Again, the field components are offset in 
both space and time. Details are available in a number of texts. A good
introduction is available in [2, Chapter 11]. For a very comprehensive study of the
FDTD method, including state-of-the-art material, refer to [1]. We will not discuss
the 3D
FDTD algorithm further here, except to note the greatly increased computational
cost associated with adding another dimension. The algorithm is now O(N^4), with
N the number of cells along each side. Halving the mesh size therefore increases
the run time by a factor of 16; doubling the frequency increases it by between
32 and 45 or so, i.e. roughly as the fifth to 5.5th power of frequency (when
numerical dispersion is correctly controlled as discussed previously).

In 3D, memory also starts becoming a serious issue; the storage requirements for 
the six field components (times two, for past and present) and the material arrays 
(in double precision) become 144 Nx × Ny × Nz bytes. A computational volume
with 100 cells on a side will require 144 MB. This will run on most contemporary 
personal computers (depending obviously on the amount of memory installed), 
but just doubling this to 200 cells increases the memory requirement to well over 
1 Gbyte. This is well within the scope of most workstations, but beyond most PCs 
at the time of writing. Double precision is unnecessary for most applications, and 
one can save storage by storing an integer index rather than the material arrays as 
done here, but even so, the storage requirement grows very rapidly. 
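
The storage and run-time figures quoted above are easily reproduced. The sketch
below assumes that the factor of 144 arises from eighteen double-precision arrays
(twelve field arrays plus six material arrays, the latter count being our
inference from the total), and that run time scales as the fourth power of the
number of cells per side. The function names are illustrative.

```python
import math

BYTES_PER_DOUBLE = 8
FIELD_ARRAYS = 6 * 2   # six field components, past and present values
MATERIAL_ARRAYS = 6    # assumed: one material array per field component


def fdtd_memory_bytes(nx, ny, nz, bytes_per_value=BYTES_PER_DOUBLE):
    """Storage estimate for a 3D FDTD mesh of nx * ny * nz cells."""
    arrays = FIELD_ARRAYS + MATERIAL_ARRAYS
    return arrays * bytes_per_value * nx * ny * nz


def runtime_factor(refinement, exponent=4):
    """Run-time growth when the number of cells per side is scaled by `refinement`."""
    return refinement ** exponent


for cells in (100, 200):
    megabytes = fdtd_memory_bytes(cells, cells, cells) / 1e6
    print(f"{cells} cells per side: {megabytes:.0f} MB in double precision")

print("single precision, 100 cells:", fdtd_memory_bytes(100, 100, 100, 4) / 1e6, "MB")
print("halving the mesh size costs a factor of", runtime_factor(2))
# Exponents implied by the quoted frequency-doubling factors of 32 and 45.
print("implied frequency exponents:", math.log2(32), "to", round(math.log2(45), 2))
```

Single precision halves the estimate, but the cubic growth in memory and the
quartic growth in run time remain.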

It is for these reasons that the development of efficient ABCs was so crucial 
as the enabling technology which permitted widespread adoption of the FDTD. 
Highly efficient ABCs permit one to place the scatterer very close to the bound- 
ary, and one can also obtain scattered fields very close to the boundary without 
unphysical reflections corrupting the fields. 

We will not discuss the three-dimensional FDTD further, but rather turn now to 
the use of a commercial code which implements the FDTD. 


3.5 Commercial implementations 


Perhaps the most well-known commercial implementations of the FDTD are XFDTD
and CST MICROWAVE STUDIO™ (MWS). The former is an implementation of the
standard FDTD. The latter is actually a suite of codes, including a transient
solver which uses the finite integration technique (FIT) [14, 15]; its
predecessor was known as MAFIA and one may still encounter reference to this
in the literature. Although apparently based on an integral equation approach to
the Maxwell equations, for Cartesian grids the FIT can be rewritten as a standard
FDTD method, and in the following we will use the term FDTD when discussing
MWS.

It is worth commenting here that the FDTD method also sometimes uses finite
integration methods, in particular for deriving subcellular models. The idea is the
following. Referring back to Fig. 3.1, instead of writing the Maxwell equations in
differential form, we will write them in integral form in this Yee cell. (As before,
we will restrict ourselves to the TE_z mode here.) Specifically, we write Faraday’s
law on contour C, the boundary of the Yee cell:

$$
\oint_C \vec{E}\cdot d\vec{l} = -\frac{\partial}{\partial t}\iint_A \mu\vec{H}\cdot d\vec{S} \qquad (3.81)
$$

Approximating the E_x and E_y components by their values at the Yee locations as
in Fig. 3.1, and approximating H_z by its value in the center of the cell, one obtains:

$$
-\frac{\partial}{\partial t}\,\mu H_z(i+\tfrac{1}{2}, j+\tfrac{1}{2})\,\Delta x\,\Delta y
 = E_x(i+\tfrac{1}{2}, j)\,\Delta x + E_y(i+1, j+\tfrac{1}{2})\,\Delta y
 - E_x(i+\tfrac{1}{2}, j+1)\,\Delta x - E_y(i, j+\tfrac{1}{2})\,\Delta y \qquad (3.82)
$$

Dividing by the area Δx Δy, and using the usual finite difference approximation in
time for ∂H_z/∂t, we obtain Yee’s FDTD algorithm.
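
The equivalence between the contour-integral form and the usual finite difference
form can be checked on a single cell. The following sketch assumes a
counterclockwise contour and a simple test field with known curl; the function
names and the test field are illustrative, not taken from any particular code.

```python
import math

MU0 = 4.0e-7 * math.pi  # free-space permeability in H/m


def dhz_dt_contour(ex_bottom, ey_right, ex_top, ey_left, dx, dy, mu=MU0):
    """dH_z/dt at the cell center from the circulation of E, as in Eq. (3.82)."""
    circulation = ex_bottom * dx + ey_right * dy - ex_top * dx - ey_left * dy
    return -circulation / (mu * dx * dy)


def dhz_dt_difference(ex_bottom, ey_right, ex_top, ey_left, dx, dy, mu=MU0):
    """The same quantity from the central-difference form of Faraday's law."""
    curl_z = (ey_right - ey_left) / dx - (ex_top - ex_bottom) / dy
    return -curl_z / mu


# Test field E = (-a*y, a*x, 0), whose curl has z component 2*a everywhere.
a = 3.0
x0, y0, dx, dy = 0.2, 0.1, 0.01, 0.02   # lower-left corner and cell size
edges = (-a * y0, a * (x0 + dx), -a * (y0 + dy), a * x0)  # bottom, right, top, left

print("contour form:   ", dhz_dt_contour(*edges, dx, dy))
print("difference form:", dhz_dt_difference(*edges, dx, dy))
print("analytic value: ", -2.0 * a / MU0)
```

The advantage of the contour form only appears when the field variation along an
edge or across the cell is known a priori, as in the subcell models.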

This form is especially useful when one wants to model fine geometrical features 
which are rather smaller than the Yee cell in the rest of the model, since the field 
behavior can be taken into account when performing the integral. (As a simple 
example, the quasi-static 1/r nature of the magnetic field near a thin wire is used 
to incorporate thin wires.) These are generally known as local subcell models.¹³
Typical examples include thin sheets, better approximations of curved boundaries, 
thin wires, and thin cracks. 
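
To illustrate how an assumed field behavior enters such a model, consider the
thin-wire case. The sketch below assumes that the looping magnetic field next to
a wire of radius a varies as 1/r across the adjacent cell of width Δ and is sampled
at r = Δ/2; integrating this across the cell replaces the cell width in the flux
term by (Δ/2) ln(Δ/a). This is the commonly used contour-path argument, offered
here as an illustration rather than as the model used in any particular code.

```python
import math


def flux_width_uniform(cell_size):
    """Effective width of the flux integral when H is constant across the cell."""
    return cell_size


def flux_width_thin_wire(cell_size, wire_radius):
    """Effective width when H(r) = H(cell_size/2) * (cell_size/2) / r.

    Integrating from the wire surface (r = wire_radius) to r = cell_size
    gives (cell_size / 2) * ln(cell_size / wire_radius).
    """
    if not 0.0 < wire_radius < cell_size:
        raise ValueError("wire radius must be positive and smaller than the cell")
    return 0.5 * cell_size * math.log(cell_size / wire_radius)


# Illustrative values: a 1 mm cell and a 0.05 mm radius wire.
cell_size, wire_radius = 1.0e-3, 0.05e-3
print("uniform-field width:", flux_width_uniform(cell_size))
print("thin-wire width:    ", flux_width_thin_wire(cell_size, wire_radius))
```

The wire radius thus enters the update equation through a logarithmic factor,
without the mesh having to resolve the wire itself.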


3.5.1 An introductory example — a waveguide “through” 


The following is the first use of a commercial code in this book (in this case,
MWS), and we will use this to highlight some important points about using an
unfamiliar simulation tool.

Firstly, most packages nowadays ship with good documentation, usually with 
some form of “Getting Started” manual, or some variant on this theme, and time 
spent working through this type of manual is time very well spent indeed. Most 
simulators have some features and functions which are not immediately obvious, 
even if one is familiar with the method implemented, and the introductory manuals 
will often highlight these and save much time and subsequent frustration. 

Secondly, even with the very best user interfaces (and MWS has a very impressive
one), modelling complex three-dimensional geometries is not straightforward.
One needs to try out simpler structures first, before attempting to model some
complex device, quite possibly of unknown performance. Although MWS is at heart an
FDTD code, the mesh is very largely invisible to the user. Model creation proceeds
by defining geometrical primitives, which are then combined into more complex
structures, before finally adding electrical parameters such as ports, field monitors
etc. (One exception in MWS is the electrical and/or magnetic properties of
materials, which are defined as needed during model building; in some other packages,
this is only done once the geometrical model is finished.)

13 Another term often used is partially filled cells. Subcell is also sometimes used to describe submeshing, a
method whereby a cell is divided into a number of smaller cells to improve accuracy.

Figure 3.18 An MWS simulation of an empty piece of waveguide, showing an extremely
low reflection coefficient as expected.

So, with the notes of caution in mind, before analyzing a real device, the first
structure which we will simulate is an empty piece of waveguide. We will do this at
X band (8.2–12.4 GHz), using a piece 40 mm long. (This is long enough to test the
model without requiring a significant run-time.) In MWS, we create the waveguide
using either of the pre-defined waveguide “templates.” (Templates simplify
generating particular types of frequently used models; in the case of a waveguide, for
instance, the exterior region is set to PEC.) Then, the “brick” primitive is used to
generate the length of waveguide (the standard cross-section inside dimensions are
22.86 mm × 10.16 mm). Finally, the “pick face” function is used twice, to assign
waveguide ports to each end of the length of waveguide. Since the waveguide is
empty, the magnitude of the transmission coefficient should be unity, and the
reflection coefficient zero. A result is shown in Fig. 3.18.¹⁴ The reflection coefficient
is less than -100 dB across the band, showing excellent performance, and giving
confidence in basic modelling and simulation setup.

14 Throughout this book, results have generally been plotted from MATLAB, using data computed by the relevant
program, so as to provide some visual unity. Most programs provide a command to export data to some type
of neutral file format.

Figure 3.19 The waveguide filter geometry, showing the metallic septa (not to scale).


3.5.2 A waveguide filter 


With some confidence that one has basic modelling skills with a particular pack- 
age, one can turn to more interesting and challenging problems. Again, we will use 
a waveguide example, but now a more complex double-pole filter. The following 
example was originally designed by Meyer and van der Walt [16]. This X-band 
waveguide filter consists of three metal septa along its center, normal to the broad 
walls of the waveguide. The smaller septa are each 6.556 mm in length, and the 
longer is 16.788 mm. The inter-septa spacing is 12.148 mm. The septa are 0.2 mm 
thick. See Fig. 3.19 for a sketch of the filter. 

When dealing with waveguide discontinuities, one of the first things one must 
note is that only the dominant waveguide mode should be present at the ports. In 
this case, an extra section of empty guide, 23.9 mm, was added, but any similar 
value would be acceptable. (The evanescent modes decay exponentially, and at
10 GHz, the guide wavelength is around 40 mm, so the above length is a little over
one-half a guide wavelength, more than sufficient.)
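
These numbers can be checked with the standard rectangular waveguide formulas.
The sketch below assumes an air-filled guide with the X-band broad dimension of
22.86 mm quoted earlier, and takes the TE20 mode as a conservative (slowly
decaying) representative of the evanescent modes; the function names are
illustrative.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def cutoff_frequency(a, m=1):
    """Cutoff frequency of the TE_m0 mode for broad wall dimension a (air-filled)."""
    return m * C0 / (2.0 * a)


def guide_wavelength(freq_hz, a):
    """Guide wavelength of the propagating TE10 mode."""
    fc = cutoff_frequency(a)
    if freq_hz <= fc:
        raise ValueError("TE10 mode is below cutoff at this frequency")
    return (C0 / freq_hz) / math.sqrt(1.0 - (fc / freq_hz) ** 2)


def evanescent_attenuation_db(freq_hz, a, length, m=2):
    """Attenuation in dB of a TE_m0 mode below cutoff over the given length."""
    k = 2.0 * math.pi * freq_hz / C0
    kc = m * math.pi / a
    if kc <= k:
        raise ValueError("mode is propagating, not evanescent")
    alpha = math.sqrt(kc * kc - k * k)           # nepers per metre
    return 20.0 * math.log10(math.e) * alpha * length


a = 22.86e-3
print("TE10 cutoff (GHz):          ", cutoff_frequency(a) / 1e9)
print("guide wavelength (mm):      ", guide_wavelength(10e9, a) * 1e3)
print("TE20 decay over 23.9 mm (dB):", evanescent_attenuation_db(10e9, a, 23.9e-3))
```

With these assumptions the guide wavelength at 10 GHz is about 39.7 mm, and the
TE20 mode is attenuated by roughly 37 dB over the 23.9 mm section, which supports
the statement that the extra length is sufficient.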

The modelling process in MWS is very similar to that already discussed in
the previous introductory example, although here the “waveguide filter” template
is chosen. (This sets some internal analysis parameters which are optimized for
highly resonant structures.) The septa inside the waveguide are added quite easily
using the “brick” primitive. (There are various ways of doing this; using the
“working coordinate system” which the package supports, a local coordinate system
which can be easily repositioned, can simplify this.)

Figure 3.20 An MWS simulation of the waveguide filter in the text.

In this case, the initial results were not especially accurate. The reason is that a 
filter relies on resonances and anti-resonances for its operation, and these must be 
computed extremely accurately for good overall accuracy. MWS offers an adap- 
tive mesh facility, which automatically refines the mesh in regions it determines. 
Using this option provides a much more accurate result in this case. In Fig. 3.20, 
three results are shown: MWS pass 1 is the result after one solution; MWS pass 4 
is the result after four adaptive passes have been undertaken; and the FEM results 
were computed using FEMFEKO, an experimental FEM program that will be described
in Chapter 10, using complete second-order vector elements.¹⁵ Clearly, the
FEM results and pass 4 are in excellent agreement. For this filter, measured data 
are also available; the measured center frequency was 10.47 GHz. This is an ex- 
ample of the difference we have already discussed between the approximate field 
problem (which these two different techniques have solved with great accuracy, 
the difference in center frequency being less than 0.1%) and the actual problem 
(both analyses differ from the measured result by about 2%); the difference is very 
likely due to manufacturing tolerances. 


15 More details on the FEM simulation may be found in Section 10.9; this solution had an average edge length
of 3.0 mm, with 4968 tetrahedral elements and 41 526 degrees of freedom.


3.5.3 A microstrip patch antenna


FDTD codes can also be applied to antennas, provided a suitable ABC is avail- 
able. MWS offers a PML-based ABC; as we have seen, this is a very accurate 
mesh truncation technique. An “antenna on planar substrate” template is available, 
although for an accurate model we will have to work a little harder. One important 
point which one must bear in mind is that with the FDTD, the substrate will not 
be of infinite extent, unless we use a suitable boundary condition to simulate this. 
This is different to the simulations we will discuss in Chapter 8, which use a form 
of the method of moments which includes stratified media in the formulation. 

The particular patch we will analyze is discussed in some detail in Section 8.2;
here, we will only give dimensions. It is 31.18 mm × 46.75 mm in size, on a
substrate 2.87 mm thick with ε_r = 2.2. The patch is fed via a pin (diameter 1.3 mm),
offset by 8.9 mm from the center of the long edge, to provide a match close to
50 Ω.

In MWS, there are two ways to simulate such an antenna. The first uses a
“discrete port.” This is an approximation of a real feeding region, and implements
either a voltage, current or “S-parameter” source (the last being a current source with
internal impedance, which is needed when computing S-parameters). It amounts
to forcing a field value at a point (or points) in the mesh. Since it is not a particu- 
larly accurate model of a physical source, there will be limitations on the accuracy 
expected, but it is fast to model and also more rapid to compute. If using a dis- 
crete port, the model is almost trivial to build: one defines the substrate using, 
once again, the “brick” primitive, then adds the patch, defines the discrete port 
at the appropriate offset location and runs the simulation. The only point which 
can cause some delay, in particular for users used to MoM codes, is that all struc- 
tures in MWS have finite thickness — MoM codes usually work with infinitely thin 
metallic sheets. For the patch, a typical metalization thickness would be 25 μm,
although the value is really not critical. MWS uses an elegant subcell model, known
as the perfect boundary approximation [17] so that thin metal sheets do not have 
to comprise a full FDTD cell. 

A more accurate model of the patch uses a coaxial feed and waveguide port. 
One way to do this is to add explicitly a ground plane of PEC of finite thickness, 
in which the coaxial feed will be embedded. (For reasons of internal code opera- 
tion, MWS recommends that the length of the coaxial feed should be several times 
the thickness of the substrate; in this case, a length of 10 mm was chosen.) When 
adding the coaxial feed pin, one needs to be careful, since one is adding struc- 
tures in regions where material already exists. In this case, it is easiest first to add 
the outer dielectric coaxial region cutting through the ground plane (and to use 
the same dielectric filler as the substrate material) and then to add the PEC inner 
conductor, which extends to become the feed pin. Although in general different
materials cannot be defined in the same geometrical region,¹⁶ MWS permits PECs
and dielectrics to coexist, but the region is effectively treated as perfectly
conducting.

Figure 3.21 An MWS simulation of a microstrip patch antenna. FEKO results are also
shown for comparison.

An alternative approach is to use a thin ground plane, and construct a coaxial 
cable on the reverse side. 

Results for two such models are compared to a FEKO computation in Fig. 3.21.
There are actually three MWS results in the plot. Model one used the discrete port
approach, and a 100 mm × 100 mm substrate, using open boundaries on the
substrate sides, an open boundary with additional space above the patch, and an
electric boundary on the ground plane. Model two used the same substrate and
boundary treatment, but a full coaxial feed model.¹⁷ Model three used the same coaxial
feed model and boundary treatment, but with a smaller substrate, 50 mm × 50 mm
in size; the results are very similar to those of model two, indicating that the open
boundary is simulating an infinite substrate quite well. All models were also run
through the adaptive meshing process. The agreement between all three models
and the FEKO computation is good; the discrete port model indicates the least
good match, but the results are still quite usable. The difference in center
frequencies between all four analyses is less than 1%. As we will see frequently in this
book, this is a commonly encountered limit in CEM for resonant antenna models,
unless tremendous care is taken with the model. (It is worth commenting that
manufacturing and material tolerances will often render this moot in any case.)

16 One makes use of various Boolean operations to combine, intersect, etc. such overlapping regions to resolve
this.
17 The outer diameter of the coaxial feed, i.e. the region penetrating the ground plane, was chosen to give
Z_0 = 50 Ω. For a coaxial cable of course, $Z_0 = \frac{\eta}{2\pi}\ln(b/a)$, with b and a the outer radius and
inner radius respectively, and η the intrinsic impedance of the dielectric filling. In this case, with the same
dielectric constant as the substrate, the outer radius was 2.24 mm.
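
The outer radius quoted in the footnote on the coaxial feed can be verified from
the standard coaxial line formula. The sketch below assumes an inner radius of
0.65 mm (half the 1.3 mm pin diameter) and a relative permittivity of 2.2 for the
dielectric filling; the function names are illustrative.

```python
import math

ETA0 = 376.730313668  # free-space wave impedance in ohms


def coax_impedance(inner_radius, outer_radius, eps_r):
    """Characteristic impedance of a coaxial line with a dielectric filling."""
    eta = ETA0 / math.sqrt(eps_r)
    return eta / (2.0 * math.pi) * math.log(outer_radius / inner_radius)


def coax_outer_radius(inner_radius, eps_r, z0=50.0):
    """Outer radius that gives the requested characteristic impedance."""
    eta = ETA0 / math.sqrt(eps_r)
    return inner_radius * math.exp(2.0 * math.pi * z0 / eta)


outer = coax_outer_radius(0.65e-3, 2.2)
print("outer radius (mm):", outer * 1e3)
print("check, Z0 (ohm):  ", coax_impedance(0.65e-3, outer, 2.2))
```

Under these assumptions the outer radius comes out at about 2.24 mm, in agreement
with the value used in the model.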


Modelling hints — open boundaries and MWS 


MWS has two types of open boundaries, both simulated using the PML, and 
the difference between them is subtle. Although we did not discuss this, PMLs 
can also terminate a region with two different materials; see, for example, [1, 
Section 7.10]. An open boundary in MWS places a PML at the plane indicated, 
permitting the code effectively to continue the substrate indefinitely. An open 
boundary (add space) does much the same, but adds some additional (free) space 
first; hence, this will not produce an infinite substrate. 


Modelling hints — parametric modelling 


Many CEM codes now permit one to “parametrize” the model. This means that 
instead of entering an actual length as the model is constructed, one instead 
defines this as a parameter which can then be changed subsequently. (We will see 
extensive use of this type of feature in Chapter 5.) MWS offers this capability, 
although we did not use it in these examples. 


3.6 Further reading 


The FDTD literature is truly massive, and a search on any of the electrical engi- 
neering databases will produce more hits than one will be able to process. One’s 
first reference should be Taflove and Hagness [1], which provides encyclopedic
coverage of most aspects of the FDTD. The Schneider-Schlager FDTD database (see
Appendix F) is also a very valuable resource. 

We have only touched the surface of the modelling possibilities of the FDTD 
method. There are a whole number of issues which one can still address. Here 
follows just a selection of these. 


• Our 2D example already indicated that the rectangular cells of the standard FDTD
  method may not approximate curved geometries very well. Methods of improving fine
  geometrical detail are generally known as “subcell” models, and usually rely on an
  equivalent formulation of the FDTD in terms of Faraday’s and Ampère’s laws, as briefly
  introduced in Section 3.5. See [1, Chapter 10] for more on this topic. Thin wires are
  another type of structure which do not fit into the Yee grid very well. Bingle, the present
  author and Cloete describe a formulation incorporating finitely conducting wires in [18].

• When dealing with wideband pulses, one should appreciate that many materials cannot
  be represented accurately by a fixed value of ε_r. Again, elegant methods have been
  developed for dealing with materials with frequency-dependent material parameters; this
  is discussed in detail in [1, Chapter 9].

• For larger scatterers, it is extremely inefficient to try to position a field point in the far
  field. Formulations are available to compute the far field from a near field time domain
  computation, which permits one to use a much smaller mesh. See [1, Chapter 8] for
  details.

• Non-linear problems can only be addressed using time domain methods. A considerable
  amount of work has been done using the FDTD for such materials, including work at
  optical frequencies. FDTD codes have also been hybridized with circuit simulators to
  include non-linear devices (e.g. diodes). [1, Chapters 9 and 15] addresses these issues.

• We have discussed one-, two- and three-dimensional formulations of the FDTD. There is
  another interesting formulation, suitable for rotationally symmetric structures: the body
  of revolution FDTD. (This has been described as a two-and-a-half dimensional
  formulation; the full three-dimensional fields are computed, but using a two-dimensional grid
  for each Fourier mode present; for some problems, only one such mode is needed.) A
  discussion of this may be found in [1, Chapter 12]. The present author and Ziolkowski
  also used this formulation for modelling optical wave phenomena; in [19], we presented
  the formulation. Rather importantly, the correct numerical stability criterion (the Courant
  limit) for this case is also given in this paper.

• The FDTD can also be used for handling periodic structures. The present author, Smith
  and van Tonder used this for modelling frequency selective surfaces [20]. The treatment
  by Maloney and Kesler [1, Chapter 13] provides an up-to-date account of the
  formulations available in this context.

• Another type of boundary condition of interest is the complementary operator. Ramahi
  has worked extensively on this, and a summary may be found in [1, Chapter 6]. Work
  also continues on other types of ABCs for the FDTD; see, for instance, [21].

• A recently (re-)discovered algorithm, the alternating direction implicit (ADI)
  formulation of the FDTD method, permits one to exceed the Courant limit, but retain stability.
  The ADI-FDTD method does pose some other challenges [22].


3.7 Conclusions 


Our treatment of the FDTD method, which started out in the previous chapter
with a very simple 1D transmission line problem, solved essentially in the
frequency domain, continued in this chapter with a quite sophisticated 2D simulation,
incorporating wideband pulses, absorbing boundary conditions, and a physical
analysis of scattering in the resonance regime¹⁸ in both the time and frequency
domains, and finished with some examples computed using the commercial package
MWS.

18 The region in which the dimension(s) of the scatterer are on the order of several wavelengths at most.

We have also looked at computational issues, both run-time and memory, which 
impact on our ability to perform useful FDTD simulations. Berenger’s PML has 
been introduced, and its extraordinary performance demonstrated. The 3D FDTD 
was briefly outlined; theoretically, there are no new issues to understand, but in 
practice writing a 3D code is challenging, since it needs to be very efficient in or- 
der to handle realistic problems (in 2D, far less optimal code can still be useful). 
Furthermore, for good results one should ideally use some of the more advanced 
FDTD approaches, in particular subcellular models and better modelling of curved 
boundaries. Unless one is fortunate enough to have access to an existing 3D FDTD 
code, such codes are generally best left to experts unless one has a very specific ap- 
plication in mind. The commercial code we discussed, MWS, provides a powerful 
implementation of the FDTD, offering (amongst other advanced modelling fea- 
tures) thin sheets, and a method called “perfect boundary approximation” which is 
essentially a type of subcell formulation improving geometrical modelling. It also 
features a user interface which at the time of writing was state-of-the-art. Other 
commercial FDTD codes are also available. 

The FDTD has truly become the workhorse of CEM computation over the last 
decade — even when it is not necessarily the best technique to use! In the next 
chapter, we introduce the method of moments, which is a very powerful method 
for dealing with highly conducting structures, and often more efficient for these 
applications than the FDTD method. 


4


A one-dimensional introduction to the method 
of moments: thin-wire modelling 


4.1 Introduction 


The method of moments — MoM — was one of the first numerical methods to 
achieve widespread acceptance in electronic engineering for the analysis of an- 
tennas and scatterers. It is generally defined as a method for reducing an integro- 
differential equation to a set of linear equations. The origins of the method are old; 
as was already indicated in Chapter 1, some of the early work was done over a 
century ago. One of the widely used integral equation formulations still used for 
the analysis of thin wires (that due to Pocklington) was first presented in 1897 
(although he used a series expansion method, rather than the modern segmenta- 
tion approach). The first publications in the antenna and propagation professional 
literature were in the early 1960s, and some of the canonical papers (those of 
Harrington, Richmond, Mei and Andreasen) appeared at much the same time 
as Yee’s paper. The specific name “method of moments” was introduced by 
Harrington in his early work, and the name caught on quickly; this was perhaps 
unfortunate, since the name has a slightly different meaning in contemporary ap- 
plied mathematics. In that field, and also fields such as computational mechan- 
ics, the term “method of weighted residuals” is generally used for what has be- 
come known as the MoM in radio-frequency engineering. Another term widely 
used in other fields of engineering is “boundary element method”; for highly 
conducting structures, this term and the MoM as used in electromagnetics are 
synonymous.¹

Primarily for two reasons, the MoM rapidly achieved widespread acceptance.
Firstly, to a generation of engineers and scientists trained on analytical methods,
along the lines of Harrington’s classic text on time-harmonic fields [1]² (which in
turn was based on methods of mathematical physics, as expounded by Stratton [2]
and Morse and Feshbach [3]), the method was clearly based on sound electromagnetic
theory and more generally, methods of mathematical physics, in particular
variational calculus (which was then in widespread use). Secondly, because the
method discretized only the metallic wires or surfaces of the antennas, it was far
more efficient than methods such as the FDTD for analyzing the relatively small,
typically resonance regime, antenna structures which were then the main topic
of research. (As we have seen, the FDTD requires the discretization of all space
surrounding the antenna or scatterer.) Furthermore, many problems then of current
research interest could be solved using the MoM in a reasonable time; this was far
less true of the FDTD, whose requirements for memory and computer time could
generally not be accommodated on 1960s era computers.

1 We return to the topic of nomenclature in the penultimate section of this chapter.
2 Originally published in 1961, but reprinted since.

In this chapter, we will present an introduction to the MoM, starting with an ex- 
tremely simple electrostatic example. Again, as with the FDTD, the simple physics 
and geometry permit us to illustrate a number of core ideas without becoming 
overwhelmed by implementation details. Following this, we will extend the discus- 
sion to electrodynamics. Thin-wire modelling uses locally one-dimensional basis 
functions, but for general wire geometries, one must of course take the full three- 
dimensional geometry into account, and hence writing one’s own MoM program 
for any reasonably interesting engineering problem is well beyond the scope of an 
introductory book of this nature. Fortunately, there are some excellent commercial 
implementations of the MoM, as well as one very useful public domain code; these 
are the topics of Chapter 5. 


4.2 An electrostatic example 


The problem we will address as an illustration of the MoM is the charge
distribution ρ(z) on a perfectly conducting straight thin wire, of radius ρ = a, charged
to a potential V volts relative to ground. It is based on an example presented in
[4, Chapter 12]. The wire could, for instance, be charged by induction. It is
important to note that this is the opposite of typical work in introductory courses in
electromagnetics, where ρ(z) is given and one must then establish the potential
(and hence field). Given ρ(z), $V(\vec{r})$ (and hence $\vec{E} = -\nabla V$) is easily found:

$$
V(\vec{r}) = \frac{1}{4\pi\epsilon_0}\int_{V'} \frac{\rho(\vec{r}')}{R(\vec{r},\vec{r}')}\,dv' \qquad (4.1)
$$

with

$$
R(\vec{r},\vec{r}') = |\vec{r} - \vec{r}'| = \sqrt{(x-x')^2 + (y-y')^2 + (z-z')^2}
$$

The primed coordinates $\vec{r}'(x', y', z')$ are those of the source point. The field point
coordinates are $\vec{r}(x, y, z)$.

However, our problem now is, given the voltage on the wire, to establish how 
the charge distributes itself. (In passing, we note that it cannot be a uniform distri- 
bution; the charges near the ends would clearly experience an unbalanced electro- 
static force which would push them towards the ends of the wire.) This falls into 
the general class of inversion problems, and cannot generally be solved in closed 
form, i.e. analytically. A numerical approach is the only general solution method 
for such problems. 

Before we proceed further, some terminology: Eq. (4.1) is known as an integral
equation; the part inside the integral operator is frequently called the kernel. The
function $V(\mathbf{r})$ is the forcing function. Another concept that is central to this
theory is that the physical environment surrounding the radiator/scatterer (in this
case, free space, i.e. an infinitely large and empty vacuum) and the boundary
conditions are all included in the formulation. This is what permits the MoM to solve
typical antenna problems (at least those involving perfect or highly conducting 
conductors) very efficiently. We will later encounter Green functions; it is these 
that effectively take the environment surrounding the structure into account, but 
they are only available for a very limited number of environments. 

The critical idea is that Eq. (4.1) is valid everywhere — including on the wire 
itself, where V(x, y, z) is known. This is the boundary condition (BC) for the 
problem. The idea that we will pursue to solve this problem is to approximate the 
charge by a number of simple functions, of unknown amplitude, which we will 
then find by assembling a matrix equation representing the geometry of the model 
and the BCs in discrete form. 


4.2.1 Some simplifying approximations 


Before we proceed further with the MoM solution of this problem, we will 
make a number of assumptions, which will considerably simplify the solution 
process. 


• Equation (4.1) contains a volumetric integral. If we assume that the wire is a perfect
  electrical conductor (PEC), the charge is restricted to the surface and becomes a surface
  charge $\rho_s(z, \rho = a, \phi)$. (Note that we use cylindrical coordinates here, and that $\rho$ refers
  both to the radius in this coordinate system and to the charge. The meaning will be clear
  from the context.)

• Secondly, we will simplify the geometry, by assuming a $\hat{z}$-directed wire.

• Thirdly, we will assume that the charge distribution is uniform in the circumferential
  direction, i.e. we can simply write $\rho_s(z, \rho = a, \phi) = \rho_s(z, \rho = a)$. This permits us
  to approximate further the surface charge $\rho_s(z, \rho = a)$ by an equivalent line charge,
  $\rho_\ell(z) = 2\pi a\,\rho_s(z, \rho = a)$, placed on the z-axis.
Using these approximations, the integral equation Eq. (4.1) becomes:

$$ V(z, \rho = a) = \frac{1}{4\pi\epsilon_0} \int_0^L \frac{\rho_\ell(z')}{R(z,z')}\, dz' \tag{4.2} $$

with (a is the wire radius):

$$ R(z,z') = \left.\sqrt{\rho^2 + (z-z')^2}\,\right|_{\rho = a} = \sqrt{a^2 + (z-z')^2} $$


Note that we now write V(z, p = a), rather than V(r), since V is restricted to 
be on the wire surface (where the boundary is) and is rotationally invariant by 
assumption. 


4.2.2 Approximating the charge 


Up to this point, the approximations have been in the mathematical formulation 
(the integral equation). Now, we introduce the MoM as a method of approximately 
solving this equation. The wire, of length L, is broken up into N segments, using 
N + 1 nodes, defined as follows: 


$$ z_n = (n-1)\Delta, \quad n = 1, 2, \ldots, N+1 \tag{4.3} $$

$$ \Delta = \frac{L}{N} \tag{4.4} $$

In the following, “segment n” will mean the segment located between $z_n$ and $z_{n+1}$.
The charge is approximated as

$$ \rho_\ell(z') \approx \sum_{n=1}^{N} a_n h_n(z') \tag{4.5} $$

Here, $a_n$ are unknown (but constant) coefficients, and $h_n(z')$ are basis functions,
also often known as expansion functions. (Many texts use $f_n(z')$, but we want to
reserve f and g for a specific purpose, discussed later in this chapter.) An example,
with N = 5, is shown in Fig. 4.1. (Note that this is the solution obtained after
the procedure to be discussed has been performed and the unknown coefficients
obtained.) Equation (4.2) thus becomes:

$$ V(z, \rho = a) = \frac{1}{4\pi\epsilon_0} \int_0^L \frac{1}{R(z,z')} \left[ \sum_{n=1}^{N} a_n h_n(z') \right] dz' \tag{4.6} $$


Basis functions 


The choice of the basis function is one of the most crucial parts of the MoM. A
large variety of possible basis functions exists. Popular choices include functions
with the following spatial variation: constant (also known as pulse or stair-step);
linear; polynomial; piecewise sinusoidal; etc. Although deficient in some aspects,
we will choose pulse basis functions for our introductory example. Each function is
defined as:

$$ h_n(z') = \begin{cases} 0 & z' < (n-1)\Delta \\ 1 & (n-1)\Delta \le z' \le n\Delta \\ 0 & n\Delta < z' \end{cases} \tag{4.7} $$

In other words, the nth function is unity in one segment (segment n) and zero
elsewhere.

Figure 4.1 Equivalent line charge density for an N = 5 segment MoM solution using
piecewise constant basis functions. L = 1 m, V = 1 V, a = 0.001 m.

Using these pulse basis functions in Eq. (4.5), and interchanging the order of
integration and summation, one obtains:

$$ 4\pi\epsilon_0 V(z) = a_1 \int_0^{\Delta} \frac{h_1(z')}{R(z,z')}\, dz' + a_2 \int_{\Delta}^{2\Delta} \frac{h_2(z')}{R(z,z')}\, dz' + \cdots + a_N \int_{(N-1)\Delta}^{N\Delta} \frac{h_N(z')}{R(z,z')}\, dz' \tag{4.8} $$

This is one equation in N unknowns, viz. $\{a_1, a_2, \ldots, a_N\}$. To obtain a unique
solution, one requires N equations, or constraints.³


³ Strictly speaking, these must be linearly independent equations.
4.2.3 Collocation


To provide these N constraints, we enforce (match) the boundary condition at N
points along the wire, $z_m$; this is also described as testing (sampling) $V(z, \rho = a)$.
This method is called collocation or point-matching. It is convenient to locate these
points in the middle of each segment, in between the nodes:

$$ z_m = (m - 1/2)\Delta, \quad m = 1, 2, \ldots, N \tag{4.9} $$

Note that unlike the FDTD method, this “half-point” offset has no adverse effect
on the accuracy of the method, and is not essential to its implementation; sampling
points at other locations within the segment would also work, and the midpoint is
merely convenient.

Sampling Eq. (4.6) at each of these N points, the following set of N equations 
is obtained: 


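$$ \begin{aligned}
4\pi\epsilon_0 V(z_1) &= a_1 \int_0^{\Delta} \frac{h_1(z')}{R(z_1,z')}\, dz' + \cdots + a_N \int_{(N-1)\Delta}^{N\Delta} \frac{h_N(z')}{R(z_1,z')}\, dz' \\
&\;\;\vdots \\
4\pi\epsilon_0 V(z_N) &= a_1 \int_0^{\Delta} \frac{h_1(z')}{R(z_N,z')}\, dz' + \cdots + a_N \int_{(N-1)\Delta}^{N\Delta} \frac{h_N(z')}{R(z_N,z')}\, dz'
\end{aligned} \tag{4.10} $$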


4.2.4 Solving the system of linear equations 


The above set of equations is a system of linear equations. At this point, it is im- 
portant to appreciate that the original integral equation inversion problem has 
now been reduced to a matrix equation inversion problem. It can be written 
as 


$$ \{V\} = [Z]\{I\} \tag{4.11} $$

sometimes known as generalized network parameters. Square braces indicate a
matrix, curly braces a vector. The relevant entries are:

$$ V_m = 4\pi\epsilon_0 V(z_m), \qquad I_n = a_n, \qquad Z_{mn} = \int_{(n-1)\Delta}^{n\Delta} \frac{dz'}{\left[(z_m - z')^2 + a^2\right]^{1/2}} \tag{4.12} $$

The n subscript refers to source points; m refers to testing (sampling) points.
Symbolically, the solution is

$$ \{I\} = [Z]^{-1}\{V\} \tag{4.13} $$


However, a linear system, usually written in the form [A]{x} = {b}, is almost never 
solved by inverting the matrix explicitly. Instead, the matrix [A] is factored into the 
product of lower and upper triangular matrices: 


[A] = [L][U] (4.14) 


Hence [L][U]{x} = {b}. An auxiliary vector {z} = [U]{x} is introduced, and then 
[L]{z} = {b} is solved by forward substitution to yield {z}; finally, {x} is solved 
from {z} = [U]{x} using backward substitution. (This process, an extension of 
Gaussian elimination, is generally covered in introductory undergraduate courses 
in numerical analysis.) 

There are a number of reasons for pursuing this rather than direct inversion of the
matrix; the most important is that solving a linear system using LU-factorization
has a cost $\sim O(N^3)$, whereas inverting a matrix costs at least twice this, since
following the factorization N forward and backward substitutions are required,
each of cost $\sim O(N^2)$.
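
The factor-then-substitute procedure of Eq. (4.14) can be sketched in a few lines. The following is a minimal illustration in plain Python, not the implementation used by any particular MoM code; the function names `lu_factor` and `lu_solve` are chosen here for illustration, and row pivoting (not mentioned in the text above) is added as an assumption because it is the usual safeguard against small pivots.

```python
def lu_factor(A):
    """LU factorization with partial (row) pivoting, so that P A = L U.

    Returns the factors packed into one matrix (unit-diagonal L below the
    diagonal, U on and above it) together with the row permutation.
    """
    n = len(A)
    LU = [row[:] for row in A]          # work on a copy of A
    perm = list(range(n))               # perm[i] = original index of row i
    for k in range(n):
        # pick the largest available pivot in column k
        p = max(range(k, n), key=lambda i: abs(LU[i][k]))
        if LU[p][k] == 0:
            raise ValueError("matrix is singular")
        if p != k:
            LU[k], LU[p] = LU[p], LU[k]
            perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            LU[i][k] /= LU[k][k]        # multiplier, stored in the L part
            for j in range(k + 1, n):
                LU[i][j] -= LU[i][k] * LU[k][j]
    return LU, perm


def lu_solve(LU, perm, b):
    """Solve A x = b from the packed factors: forward, then backward substitution."""
    n = len(LU)
    z = [0.0] * n
    for i in range(n):                  # [L]{z} = P{b}
        z[i] = b[perm[i]] - sum(LU[i][j] * z[j] for j in range(i))
    x = [0.0] * n
    for i in reversed(range(n)):        # [U]{x} = {z}
        s = sum(LU[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (z[i] - s) / LU[i][i]
    return x


A = [[2.0, 1.0, 1.0], [4.0, -6.0, 0.0], [-2.0, 7.0, 2.0]]
LU, perm = lu_factor(A)                 # O(N^3), done once
print(lu_solve(LU, perm, [5.0, -2.0, 9.0]))   # each further solve is O(N^2)
```

Once the factors are available, each additional right-hand side (for instance, a new excitation of the same structure) costs only the two substitution passes.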

Before the matrix equation can be solved, however, there is still one issue to 
attend to. In Eq. (4.12), the term Z,,, is given as an integral over the nth segment. 
This usually has to be done numerically using quadrature (numerical integration). 
In this specific case, analytical results are available [4, p. 674]: 


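$$ Z_{mn} = \begin{cases}
2 \ln\left( \dfrac{\Delta/2 + \sqrt{a^2 + (\Delta/2)^2}}{a} \right) & m = n \\[2ex]
\ln\left( \dfrac{d^+_{mn} + \sqrt{(d^+_{mn})^2 + a^2}}{d^-_{mn} + \sqrt{(d^-_{mn})^2 + a^2}} \right) & m \ne n \text{ but } |m - n| \le 2 \\[2ex]
\ln\left( \dfrac{d^+_{mn}}{d^-_{mn}} \right) & |m - n| > 2
\end{cases} \tag{4.15} $$

with

$$ d^+_{mn} = l_m + \Delta/2, \qquad d^-_{mn} = l_m - \Delta/2, \qquad l_m = \sqrt{(m-n)^2 \Delta^2 + a^2} \tag{4.16} $$

The last parameter, $l_m$, is the distance between the mth matching point and the center
of the nth source segment.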
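
To make the whole procedure concrete, the sketch below assembles and solves the point-matched system for the charged wire. It is an illustration under stated assumptions rather than the book's MATLAB script: the function names are invented here, the matrix entries of Eq. (4.12) are evaluated through the antiderivative $\sinh^{-1}(u/a)$ of the on-axis kernel instead of a tabulated closed form, and a small Gaussian elimination routine stands in for a library solver.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m


def solve_dense(A, b):
    """Gaussian elimination with partial pivoting (adequate for small N)."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]   # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def z_entry(m, n, delta, a):
    """Z_mn of Eq. (4.12): integral of 1/sqrt((z_m - z')^2 + a^2) over segment n."""
    zm = (m - 0.5) * delta                 # matching point, Eq. (4.9)
    lo, hi = (n - 1) * delta, n * delta    # ends of source segment n
    # the antiderivative of 1/sqrt(u^2 + a^2) is asinh(u/a), with u = z' - z_m
    return math.asinh((hi - zm) / a) - math.asinh((lo - zm) / a)


def wire_charge(N, L=1.0, V=1.0, a=0.001):
    """Line charge density (C/m) on each of the N pulse-basis segments."""
    delta = L / N
    Z = [[z_entry(m, n, delta, a) for n in range(1, N + 1)]
         for m in range(1, N + 1)]
    rhs = [4.0 * math.pi * EPS0 * V] * N   # V_m of Eq. (4.12)
    return solve_dense(Z, rhs)


rho = wire_charge(5)                       # parameters of Fig. 4.1
print(["%.2f pC/m" % (1e12 * r) for r in rho])
```

With the parameters of Fig. 4.1 the densities come out symmetric about the wire center and largest on the two end segments, which is the qualitative behavior argued for earlier in this section.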
Figure 4.2 Comparison of 5 segment and 100 segment solutions.


4.2.5 Results and discussion 


Results are shown in Figs. 4.1 and 4.2. In Fig. 4.1, the piecewise constant nature 
of the basis function has been explicitly shown (the bar command in MATLAB 
provides a simple way of doing this). In Fig. 4.2, one observes that the N = 5 
solution is surprisingly accurate, although of course it does not correctly predict 
the behavior of the charge at the ends of the wire. 

A number of approximations have been made in this development. These in- 
clude the following, with the implications indicated. 


• An equivalent line charge was assumed. This relied on a rotationally symmetric charge
  distribution. For a thin wire, this is generally a very good approximation.

• The ends of the wire were ignored; for instance, was the wire a hollow or solid tube?
  Again, for thin wires, this is a reasonable approximation.

• In the collocation process, the integrals (which represent the boundary conditions) were
  only exactly enforced at N discrete points. In between these points, the potential will
  depart from the specified value. Fortunately, using more (i.e. smaller) segments will
  reduce the impact of this.

• The specific basis function that was chosen (constant) is discontinuous at segment
  ends. Since we were approximating charge, which is continuous, this is non-physical.
  This is clearly evident in Fig. 4.1. (Again, the impact of this can be mitigated using
  smaller segments.)

• We assumed that the surface of the wire was perfectly conducting, so that the wire was
  an equipotential surface. For most good conductors, this is a very good approximation.


The reason that we are discussing these in detail is that all these comments also 
apply to electrodynamics. 


4.3 Thin-wire electrodynamics and the MoM 


With these basics behind us, electrodynamics (or full-wave behavior) can now be 
investigated. The ideas of incident and scattered field decomposition are important 
here. Other than this, and the more complex equations, we will find the overall 
process very similar indeed. 


4.3.1 The electrically thin dipole 


The problem that we now want to solve is the current distribution $I(z)$ on a straight
thin wire. It is assumed here that the basics of the dipole radiator have already 
been studied. In such introductory courses on electrodynamics, some assumption 
is generally made regarding the distribution. For very short dipoles, a linear or even 
constant approximation of current can yield quite good results, and for the typical 
resonant dipole, the widely assumed sinusoidal distribution also produces useful 
results. However, the most obvious information which cannot be thus obtained is 
the reactance of the dipole. 

Although the overall process is very similar to the electrostatic charge distribution
problem just worked out, there are two important differences. Firstly, the
boundary condition, which for a perfect electric conductor is:

$$ \mathbf{E}_{\tan} = 0 \tag{4.17} $$

We will use the incident/scattered field decomposition method. This was already
introduced with the FDTD. To revise this briefly: since the Maxwell equations are
linear, the fields may be decomposed into an incident field $\mathbf{E}^{inc}$ and a scattered
field $\mathbf{E}^{scat}$. The overall field, called the total field $\mathbf{E}^{t}$, is then:

$$ \mathbf{E}^{t} = \mathbf{E}^{inc} + \mathbf{E}^{scat} \tag{4.18} $$


By definition, the incident field is the field which would exist if the scatterer were
absent. As an example, if the incident field is a plane wave, propagating in the
x-direction, in free space, with a z-polarized electric field, the expressions for the
incident fields are:

$$ \mathbf{E}^{inc} = e^{-jkx}\, \hat{z} \tag{4.19} $$

$$ \mathbf{H}^{inc} = -\frac{1}{\eta_0}\, e^{-jkx}\, \hat{y} \tag{4.20} $$

As usual, $\eta_0 = \sqrt{\mu_0/\epsilon_0}$ is the wave impedance of free space, and $k = 2\pi/\lambda_0$ is
the wavenumber. It is of interest to compare these expressions to those used in
Section 3.2.3. The main difference is of course that these expressions are frequency
domain ones. (A rather more minor difference is the polarization of the electric
field, here in the $\hat{z}$-direction.) On the surface of a PEC wire, $E_z^{t} = 0$. The
boundary condition on the surface of the wire thus becomes

$$ E_z^{inc} = -E_z^{scat} \tag{4.21} $$

As indicated above, $E_z^{inc}$ typically has a simple form. The scattered field, $E_z^{scat}$,
must be computed from the surface current.

In general, the electric field can be computed from the magnetic vector potential
$\mathbf{A}$ and electric scalar potential $\Phi$ as

$$ \mathbf{E} = -j\omega\mathbf{A} - \nabla\Phi \tag{4.22} $$

It will be recalled that various gauges can be applied to these potentials.⁴ The
Lorenz⁵ gauge is widely used in this context:

$$ \nabla \cdot \mathbf{A} = -j\omega\mu_0\epsilon_0 \Phi \tag{4.23} $$

Applied now to the $\hat{z}$-directed surface current source, and assuming that the wire is
in free space, so that $\epsilon$, $\mu$ and the wavenumber k have the usual values in vacuum,⁶
this becomes

$$ \frac{\partial A_z}{\partial z} = -j\omega\mu_0\epsilon_0 \Phi \tag{4.24} $$

Hence,

$$ E_z^{scat}(\mathbf{r}) = -j \frac{1}{\omega\mu_0\epsilon_0} \left( k^2 A_z + \frac{\partial^2 A_z}{\partial z^2} \right) \tag{4.25} $$

with

$$ A_z = \frac{\mu_0}{4\pi} \int_{-\ell/2}^{\ell/2} \int_0^{2\pi} J_z(\phi', z')\, \frac{e^{-jkR}}{R}\, a\, d\phi'\, dz' \tag{4.26} $$

We have used the “free-space Green function” here, $\psi(z, z') = \dfrac{e^{-jkR}}{4\pi R}$, which
gives the resulting magnetic vector potential for a current element.⁷ R is the distance
from source to field point coordinates. Substituting Eq. (4.26) in Eq. (4.25),
and integrating over the source region, one obtains

$$ E_z^{scat}(\mathbf{r}) = \frac{1}{j\omega\epsilon_0} \int_{-\ell/2}^{\ell/2} \int_0^{2\pi} \left[ \frac{\partial^2 \psi(z, z')}{\partial z^2} + k^2 \psi(z, z') \right] J_z(\phi', z')\, a\, d\phi'\, dz' \tag{4.27} $$

Note that the differentiation in Eq. (4.25) has been taken inside the integral operator.
This is valid since the differentiation is with respect to the field point coordinates,
and the integration is over the source points.


⁴ The potentials are not unique, and contain elements of arbitrariness, which the gauging resolves.

⁵ More properly attributed to L. Lorenz than H. Lorentz.

⁶ This formulation is actually valid in any linear, isotropic and uniform medium, with $\mu$ and $\epsilon$ taking the
appropriate values. For simplicity, we show only the free-space case.

⁷ A more detailed discussion of Green functions is deferred to Chapter 7.

At this stage, the unknown is still the $\hat{z}$-directed (by assumption) surface current
$J_z(\phi', z')$. For sufficiently thin wires, this can be reduced to the Pocklington
equation, first introduced in 1897:

$$ E_z^{scat}(\mathbf{r}) = \frac{1}{j\omega\epsilon_0} \int_{-\ell/2}^{\ell/2} \left[ \frac{\partial^2 \psi(z, z')}{\partial z^2} + k^2 \psi(z, z') \right] I(z')\, dz' = -E_z^{inc}(\mathbf{r}) \tag{4.28} $$

This equation is obtained by assuming that (as for the electrostatic case), we locate
the filament on the axis and enforce the boundary condition on the surface (the
reciprocal case is sometimes more convenient in deriving this). Although it looks
fairly straightforward, the presence of the second derivative with respect to z inside
the integral kernel, acting on the Green function, makes this non-trivial to implement.
A useful further simplification can be made if the wire is assumed very thin ($a \ll \lambda$):

$$ \int_{-\ell/2}^{\ell/2} I(z')\, \frac{e^{-jkR}}{4\pi R^5} \left[ (1 + jkR)(2R^2 - 3a^2) + (kaR)^2 \right] dz' = -j\omega\epsilon_0 E_z^{inc}(\rho = a) \tag{4.29} $$

with a the wire radius and $R = \sqrt{a^2 + (z - z')^2}$. This is now a convenient form
to program. It appears in numerous texts (for example, [4, p. 720]) and appears to
have been first introduced by Richmond [5] (reprinted in [6]).

Further discussion on these and other integral equations (such as Hallén’s) may 
be found in [4, 7]. 
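
As a check on the thin-wire kernel of Eq. (4.29), the sketch below evaluates the bracketed closed form and compares it with a direct finite-difference evaluation of $(\partial^2/\partial z^2 + k^2)$ acting on the free-space Green function $e^{-jkR}/(4\pi R)$. The function names, the step size `h` and the sample values are assumptions made for illustration only.

```python
import cmath
import math


def green(u, a, k):
    """Free-space Green function exp(-jkR)/(4 pi R), R = sqrt(a^2 + u^2), u = z - z'."""
    R = math.sqrt(a * a + u * u)
    return cmath.exp(-1j * k * R) / (4.0 * math.pi * R)


def richmond_kernel(u, a, k):
    """Integrand of Eq. (4.29), excluding the current I(z')."""
    R = math.sqrt(a * a + u * u)
    bracket = (1.0 + 1j * k * R) * (2.0 * R**2 - 3.0 * a**2) + (k * a * R) ** 2
    return cmath.exp(-1j * k * R) / (4.0 * math.pi * R**5) * bracket


def pocklington_kernel_fd(u, a, k, h=1e-4):
    """(d^2/dz^2 + k^2) applied to the Green function, by central differences."""
    d2 = (green(u + h, a, k) - 2.0 * green(u, a, k) + green(u - h, a, k)) / h**2
    return d2 + k**2 * green(u, a, k)


k = 2.0 * math.pi          # wavelength of 1 m (illustrative)
a = 0.005                  # wire radius in wavelengths (illustrative)
for u in (0.1, 0.3):
    exact = richmond_kernel(u, a, k)
    approx = pocklington_kernel_fd(u, a, k)
    print(u, abs(exact - approx) / abs(exact))
```

The two evaluations should agree to within the truncation error of the difference formula; the closed form avoids the numerical differentiation altogether, which is what makes it convenient to program.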

Before solving this numerically, recall that we are assuming the following. 


• Circumferential currents are negligible.

• The axial current $I(z')$ does not vary circumferentially. (This is not the same as the first
  assumption!)

• As for the electrostatic case, we locate the filament on the axis and enforce the boundary
  condition on the surface, or the reciprocal case.


The reason that we offset the source filament and testing surface (or vice versa)
is, as in the electrostatic case, to avoid the singularity present at z = z'. Although
approximate, this method works well for thin wires. As for the static case, the kernel
is not singular, but for small a can become more nearly so than in the electrostatic
case (the $R^5$ term in the denominator of Eq. (4.29) is largely responsible),
and more sophisticated treatments are frequently used. The problem usually occurs
with the “self” term (the element of [Z] with m = n). The usual remedy is
to subtract a term with the same order of singularity but which can be integrated
analytically, and then to integrate numerically the difference between the original
kernel and this subtracted term, since this difference is usually quite well behaved.
Examples of this type of treatment of singular integrals will be discussed in Chapter 7
(although in a slightly different context).


Approximating the current 


The same idea is used for approximation of the current as we used for charge, 
namely some sort of discrete approximation using a set of functions of known 
shape but unknown amplitude. The most widely used basis functions are pulse 
(piecewise constant, as used for the electrostatic problem), triangular (piecewise 
linear) and piecewise sinusoidal. An especially convenient form arises when piecewise
sinusoidal basis functions are chosen. In this case, for a wire $\ell$ in length, lying
on the z-axis from $-\ell/2$ to $\ell/2$, the nodes are defined as

$$ z_n = -\ell/2 + (n-1)\Delta, \quad n = 1, 2, \ldots, N+1 \tag{4.30} $$

$$ \Delta = \frac{\ell}{N} \tag{4.31} $$

The basis function on the nth segment is:

$$ I_n(z) = \begin{cases} \dfrac{I_n \sin k(z_{n+1} - z) + I_{n+1} \sin k(z - z_n)}{\sin k\Delta} & z_n \le z \le z_{n+1} \\[2ex] 0 & \text{otherwise} \end{cases} \tag{4.32} $$

It actually consists of two parts, with two associated (and unknown) coefficients
$I_n$ and $I_{n+1}$. It is often convenient to reinterpret the function as spanning two
segments, with one associated coefficient $I_n$. With this interpretation, it may be
shown⁸ that the $\hat{z}$-directed scattered field from the nth basis function is given by:

$$ E_z^{scat} = -j30\, I_n \left[ \frac{e^{-jkR_{n-1}}}{R_{n-1} \sin k(z_n - z_{n-1})} - \frac{e^{-jkR_n} \sin k(z_{n+1} - z_{n-1})}{R_n \sin k(z_n - z_{n-1}) \sin k(z_{n+1} - z_n)} + \frac{e^{-jkR_{n+1}}}{R_{n+1} \sin k(z_{n+1} - z_n)} \right] \tag{4.33} $$


⁸ A detailed derivation of this was given in the first edition of Stutzman and Thiele’s antenna text [8, p. 330], but
was removed from the second edition [7].
The lengths $R_{n-1}$, $R_n$ and $R_{n+1}$ are respectively the distances from nodes n − 1,
n and n + 1 to the field point.

With this particular choice of basis function, the integrals can be carried out
analytically, and this has been quite widely used in MoM codes. Note that at the
ends of the wire, the end-node terms $I_1$ and $I_{N+1}$ are ignored, essentially forcing them to
zero (which is the expected behavior of the current).

As for the electrostatic case, a linear system is assembled using the results for 
the field scattered by each segment. The simplest “testing” scheme is again collo- 
cation: this is most conveniently done at the nodes in the case of sinusoidal basis 
functions, which are in the center of the basis functions as defined above. 
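
The two-segment reinterpretation of the piecewise sinusoidal basis function can be illustrated with a short sketch. The names `pws_basis` and `current`, the 0-based node indexing and the sample coefficients below are assumptions made for this illustration; they are not taken from the MATLAB code accompanying the book.

```python
import math


def pws_basis(z, n, nodes, k):
    """Two-segment piecewise sinusoidal basis function centred on node n (0-based).

    It rises from zero at node n-1 to unity at node n, then falls back to
    zero at node n+1; at an end node only the half inside the wire exists.
    """
    zc = nodes[n]
    if n > 0 and nodes[n - 1] <= z <= zc:                  # rising half
        return math.sin(k * (z - nodes[n - 1])) / math.sin(k * (zc - nodes[n - 1]))
    if n < len(nodes) - 1 and zc <= z <= nodes[n + 1]:     # falling half
        return math.sin(k * (nodes[n + 1] - z)) / math.sin(k * (nodes[n + 1] - zc))
    return 0.0


def current(z, coeffs, nodes, k):
    """Total current: node coefficients weighted by their basis functions."""
    return sum(c * pws_basis(z, n, nodes, k) for n, c in enumerate(coeffs))


# Illustrative set-up: 0.47-wavelength wire, N = 4 segments, wavelength 1 m.
length, N, k = 0.47, 4, 2.0 * math.pi
nodes = [-length / 2 + i * length / N for i in range(N + 1)]
coeffs = [0.0, 0.5, 1.0, 0.5, 0.0]      # end-node coefficients forced to zero
print([round(current(z, coeffs, nodes, k), 3) for z in nodes])
```

Because each function is unity at its own node and zero at its neighbors, the coefficient at a node is simply the current there, and setting the two end coefficients to zero enforces the expected vanishing of the current at the wire ends.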


The incident field 


It is important to realize that an MoM problem requires some form of excitation
(in the same way as an FDTD model, for instance); commercial codes are no exception.
A key difference between the electrostatic and electrodynamic cases is the
concept of the incident field, as already outlined, which provides this excitation.
For an incident plane wave, peak value $E_0$ V/m, normally incident on the z-directed
dipole (along the x-axis in this case), the expression is:

$$ E_z^{inc} = E_0\, e^{-jkx} \tag{4.34} $$

as already discussed.

For an antenna problem, a very simple form of feed is the “delta-gap”; in this
case

$$ E_z^{inc} = \pm V/\delta \tag{4.35} $$

for an impressed voltage of V at the terminals of the antenna and gap length $\delta$
(quite often, the length of the segment). This source is also sometimes placed at
the node between segments. The sign depends on the convention adopted regarding
voltage. For the basis functions discussed, the direction of positive current flow is
from node n to n + 1.

More realistic models are available, such as the “frill” source. This models a
coaxial line, whose center conductor becomes a monopole and whose outer conductor
opens into an infinite ground plane. In this case, the electric field on the
axis of the $\hat{z}$-directed monopole is (again, similar comments pertain regarding the
sign):

$$ E_z^{inc} = \pm \frac{V}{2 \ln(b/a)} \left( \frac{e^{-jkR_i}}{R_i} - \frac{e^{-jkR_o}}{R_o} \right) \tag{4.36} $$

with

$$ R_i = \sqrt{z^2 + a^2}, \qquad R_o = \sqrt{z^2 + b^2} \tag{4.37} $$

where a and b are the inner and outer radii of the coaxial feedline. V is the terminal
voltage. Usually, this is used as an equivalent model, in which case a is
the radius of the wire and b is then chosen as some reasonable value; often the
equivalent characteristic feedline impedance $Z_0 = 60 \ln(b/a)$ is chosen as 50 Ω,
i.e. $b \approx 2.3a$. It is worth commenting that the current (and hence antenna terminal
impedance) is very little affected by this value.

Figure 4.3 Current on a resonant dipole computed with the MoM using piecewise
sinusoidal basis functions and collocation. L = 0.47λ, a = 0.005λ, with N = 60 segments.
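
The choice of the equivalent outer radius, and the resulting on-axis frill field of Eq. (4.36), can be sketched as follows. The function names are invented for this illustration, the positive sign in Eq. (4.36) is assumed, and the numerical values are examples only.

```python
import cmath
import math


def frill_outer_radius(a, z0=50.0):
    """Outer radius b for which the coaxial impedance 60 ln(b/a) equals z0 (ohms)."""
    return a * math.exp(z0 / 60.0)


def frill_field(z, a, b, k, V=1.0):
    """On-axis frill field of Eq. (4.36), taking the + sign convention."""
    r_inner = math.sqrt(z * z + a * a)     # distance to the inner edge of the frill
    r_outer = math.sqrt(z * z + b * b)     # distance to the outer edge of the frill
    return V / (2.0 * math.log(b / a)) * (
        cmath.exp(-1j * k * r_inner) / r_inner
        - cmath.exp(-1j * k * r_outer) / r_outer
    )


a = 0.005                                  # wire radius in wavelengths (illustrative)
b = frill_outer_radius(a)                  # 50 ohm equivalent feedline
k = 2.0 * math.pi                          # wavelength of 1 m (illustrative)
print("b/a =", round(b / a, 2))
print("|E| at feed:", abs(frill_field(0.0, a, b, k)))
print("|E| 0.1 wavelengths away:", abs(frill_field(0.1, a, b, k)))
```

The field is concentrated close to the feed point and falls away quickly along the wire, which is consistent with the frill acting as a localized source.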


Some computed results 


An example for the current distribution on a thin resonant dipole (L = 0.47λ,
a = 0.005λ) computed using the MoM is shown in Fig. 4.3. This MoM code,
implementing the theory in this chapter in MATLAB, uses piecewise sinusoidal basis
functions and collocation. Results are shown for both the delta-gap and magnetic
frill sources, using N = 60 segments. The impedance computed with the former
was Z = 76.7 + j4.7 Ω and for the latter, Z = 74.8 + j8.2 Ω. Considering the
relative simplicity of the approximation, this agreement is excellent. An even better
comparison is to look at the magnitude of the reflection coefficient Γ; a 75 Ω
system is appropriate here (and was also used for the equivalent coaxial radius in
the frill model); the results are −29.7 dB and −25.3 dB respectively. Anyone who
has ever tried to measure the reflection coefficients of antennas will be aware that
such agreement is more than satisfactory.
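
The conversion from input impedance to reflection coefficient magnitude is a one-line calculation, sketched below with the two impedances quoted above. The function name `gamma_db` is an assumption for illustration; because the quoted impedances are rounded, the computed values can be expected to match the quoted ones only to within about 0.1 dB.

```python
import math


def gamma_db(z_in, z0=75.0):
    """Magnitude of the reflection coefficient, 20 log10 |Gamma|, in dB."""
    gamma = (z_in - z0) / (z_in + z0)
    return 20.0 * math.log10(abs(gamma))


for label, z_in in (("delta-gap", 76.7 + 4.7j), ("magnetic frill", 74.8 + 8.2j)):
    print(label, round(gamma_db(z_in), 1), "dB")
```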

However, this computed result appears better than it actually is! What is
not shown on Fig. 4.3 is that the magnetic frill source converges very slowly;
60 segments corresponds to a sampling density of around 120 per wavelength,
approximately an order of magnitude greater than the usual rule-of-thumb for full-wave
MoM codes. Using N = 6, the delta-gap model produces Z = 62.1 − j67.8 Ω;
the real part is moderately accurate although the reactive part is not; however, the
magnetic frill prediction, Z = 13.8 − j15.1 Ω, is unconverged and entirely misleading.
In Chapter 5, we discuss checking convergence of computed data in some
detail. Commercial codes use somewhat more sophisticated treatments than those
discussed here to obtain more rapid convergence.


4.3.2 A caveat regarding thin-wire formulations 


An important point to note with thin-wire formulations is that they admit no exact 
solution, and exhibit a phenomenon known as relative convergence: as the number 
of unknowns in an MoM solution is increased, the solution converges initially to 
a value close to the exact solution (what has been called the region of rapid ini- 
tial convergence), then enters a stable region, and finally diverges in a region of 
instability. For wires which are too thick for effective use of the thin-wire approx- 
imation, there is no stable region at all. This was considered in detail by Collin [9] 
(reprinted in [6]) and is also discussed in his textbook [10]. 


4.4 More on basis functions 


Suitable basis functions were the topic of research for many years, and in this 
section, some details are provided of two other solutions which have been widely 
adopted. Firstly, it is appropriate to provide some background on a public domain 
code, NEC-2, which for many years was the workhorse of MoM computation. The 
basis function used by NEC-2 had some particularly elegant features. Following 
this, some more details are provided on piecewise linear basis functions, which are 
also very popular. 


4.4.1 The numerical electromagnetic code (NEC) — method of moments 


It would be inappropriate in a book of this nature not to include some discussion 
of NEC, or NEC-2 in the case of the public domain version. This code has a long 
lineage, with its genesis in a code called BRACT (released in 1970), which was developed
by contractors MBA Associates, primarily for US Air Force applications.
The code that eventually became NEC started as the AMP (Antenna Modelling 
Program), first released in 1974, again with US military funding. A discussion 
of the theoretical background and a number of applications for what is clearly 
this code (although unnamed in the article) may be found in [11], available in the 
collection [6]. 

NEC-1 was released in 1977, and NEC-2 in 1981. NEC-2 became, and still is,
the most widely used public domain MoM code.⁹ NEC-3 was an intermediate version,
and saw only limited distribution; NEC-4 was released in 1992 and was the
last major release of the code, which is no longer being actively developed further.
Until very recently, NEC-4 was still US Military Restricted technology,¹⁰
although all NEC-4 functionality is now available in commercial codes (most 
prominently FEKO). The various NEC codes were developed at the Lawrence 
Livermore National Laboratory, one of the major US government research lab- 
oratories. Here, we will focus on NEC-2, owing to its ready availability; despite 
its even more venerable age, it is still a useful tool and quite widely used as a 
benchmark. 

NEC-2 incorporates the Pocklington integral equation formulation for thin 
wires, as well as a treatment for closed conducting surfaces (the magnetic field 
integral equation, which will be discussed in Chapter 6). It includes support for 
a number of features very useful in modelling wire antennas, including: non- 
radiating networks (e.g. transmission lines); lumped element loading; perfectly or 
highly conducting wires; incident plane-wave or voltage sources; and treatments 
of perfect or imperfect grounds. The last included the Sommerfeld formulation 
for half-spaces; this will be discussed in Chapter 7. It can compute induced cur- 
rents and charges; near- and far-fields (electric or magnetic); radar cross-section; 
antenna impedance (and admittance); gain and directivity; and antenna to antenna 
coupling. It can exploit symmetry of rotation or reflection. 

NEC-2 was primarily developed for wire antenna modelling, and many of the 
problems which have been reported with NEC-2 arose because users tried to use 
it for modelling surfaces via meshes of wires. Although one can obtain useful 
answers with careful work with this approach, it is not the purpose for which the 
code was primarily designed. Provided that NEC-2 is used within its limits, it is 
still a very useful code. 


⁹ Whether it was indeed the intention of the US government to make the code public domain is still not entirely
clear, but this became the de facto situation by the 1990s.

¹⁰ Since 2003, NEC-4 has been available for a very modest license fee for users in most countries.


4.4.2 NEC basis functions


Much of the success of NEC was due to the basis function used. (In the following
discussion, NEC and NEC-2 will be used interchangeably; this theory is also
applicable to NEC-4.) A highly desirable requirement of a good basis function is
that it should satisfy physical requirements of current and charge continuity. This
implies that both the current and its first derivative should be continuous. NEC
makes the usual thin-wire approximations, viz. transverse currents are negligible;
the circumferential variation of current is negligible; current can be represented
by a filament on the wire axis; and the boundary conditions on the tangential electric
field are only enforced axially, so the basis function is one-dimensional, as
in our preceding discussion in this chapter. In developing the basis function, the
following interpolation function is first introduced for segment j:

$$ I_j(s) = A_j + B_j \sin k(s - s_j) + C_j \cos k(s - s_j), \quad |s - s_j| < \Delta_j/2 \tag{4.38} $$

The parameter s is a local coordinate along the length of the wire, with $s_j$ the value
of s at the center of segment j. $\Delta_j$ is the length of segment j. This is based on a
function originally proposed by Yeh and Mei [12] (reprinted in [6]). Although this
is quite often described loosely as the basis function, this is not entirely correct.
The full basis function is rather more complex. Each NEC basis function spans at
least three segments: central, left (minus) and right (plus), supporting interpolation
functions of the form of Eq. (4.38) on each segment, $f_i^0$, $f_{ij}^-$ and $f_{ij}^+$ respectively.
The double subscript is used to identify the jth segment connected to segment i.
Figure 4.4 shows the situation for a wire segment with two wire segments connecting
to the left and two to the right of the central segment. (In this case, the basis
function “centered” on segment i spans all five segments.) For a straight wire,
with only one segment on the left and one on the right, one can drop the double
subscript and the basis function comprises interpolation functions $f_i^0$, $f_i^-$ and $f_i^+$
associated with it (nine unknowns in total), each interpolating as Eq. (4.38). For a
wire junction as in Fig. 4.4, there are contributions from five segments (and hence
15 unknowns).

Figure 4.4 Segments covered by the ith basis function.

The unknowns are now reduced to one per segment by the following constraints: 


(1) The current must go to zero at outer edges of connected segments. 

(2) The derivative of the current must go to zero at outer edges of connected segments. 

(3) The current must be continuous at a segment junction. 

(4) At a segment junction, the charge must satisfy a condition known as the Wu—King 
condition; it is continuous for a straight, uniform wire. 


These conditions are then enforced on each individual basis function — these are 
sufficient (but not necessary) conditions to ensure current and charge continuity, 
since the final approximation of current is a linear sum of these basis functions. 
This was a crucial insight. 

For example, these constraints for a segment in a straight wire are as follows. 


(1) One from end $1^-$ and one from end $1^+$.

(2) Again, one from end $1^-$ and one from end $1^+$.

(3) Two (one at each end of the central segment).

(4) Four (one each from the segments connected to the − and + ends, two from the central
segment itself).


This amounts to ten constraints. From Eq. (4.38), there are three unknowns per 
interpolation function, and three such functions, making nine unknowns. A charge- 
related parameter at the segment junctions provides two additional (“invisible”) 
unknowns, producing eleven unknowns per wire segment (more details on this 
are given below). The ten constraints are then applied to yield one unknown per
segment, which is arbitrarily chosen as $A_i^0$, i.e. the coefficient associated with
the constant part of the interpolation function centered on segment i. The details
of this process are quite lengthy, and are available in [13]. 

The advantage of this formulation is that it can be generalized to handle multi- 
wire connections. Although it appears complex (and indeed the implementation is 
non-trivial), it is handled entirely within the code and the user is unconcerned with 
the details. 

NEC-2 can also handle junctions involving wires of different radii. The so-called
Wu-King condition (an attempt to enforce the continuity of scalar potential, which
is the correct quasi-static continuity condition) is applied at each junction:


$$\left. \frac{\partial I(s)}{\partial s} \right|_{\text{at junction}} = \frac{Q}{\ln(2/ka) - \gamma} \tag{4.39}$$


In this expression, $\gamma = 0.5772$ is Euler’s constant. Q is an unknown related to
charge: it is constant for all wires at a junction and is the “invisible” unknown
in the previous discussion.
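
To get a feel for the size of the terms in Eq. (4.39), the following minimal Python sketch evaluates the factor $1/(\ln(2/ka) - \gamma)$ for two wire radii. It assumes that k is the free-space wavenumber and a the wire radius; the function name, the radii and the frequency are illustrative choices, not values taken from NEC-2.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler's constant, gamma in Eq. (4.39)
C0 = 299792458.0            # speed of light in vacuum, m/s


def wu_king_factor(radius_m, freq_hz):
    """Return 1 / (ln(2/(k a)) - gamma) for a wire of radius a (assumed meaning of a)."""
    k = 2.0 * math.pi * freq_hz / C0  # free-space wavenumber (assumption)
    return 1.0 / (math.log(2.0 / (k * radius_m)) - EULER_GAMMA)


# Q is common to all wires at a junction, so the current slopes dI/ds on two
# wires of different radius are in the ratio of these factors.
thin = wu_king_factor(0.001, 300e6)   # hypothetical 1 mm radius wire
thick = wu_king_factor(0.005, 300e6)  # hypothetical 5 mm radius wire
print(f"thin: {thin:.4f}  thick: {thick:.4f}  slope ratio: {thick / thin:.3f}")
```

Because Q is shared, the thicker wire carries the steeper current slope at the junction; for equal radii the factors coincide and the slope is simply continuous.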


4.4.3 Piecewise linear basis functions 


The NEC-2 basis function is very useful for modelling wire antennas, but is dif- 
ficult to apply when the structure to be modelled comprises large amounts of 
conducting surfaces. We will discuss effective methods for modelling surfaces in 
Chapter 6; at present, all we need to know is that the usual basis function for 
this is piecewise linear. Hence, such basis functions are very convenient for mod- 
els including both wires and surfaces. The formulation is very similar to that of 
Eq. (4.32): 


$$h_n(z) = \begin{cases} \dfrac{I_n (z_{n+1} - z) + I_{n+1} (z - z_n)}{z_{n+1} - z_n} & z_n \le z \le z_{n+1} \\ 0 & \text{otherwise} \end{cases} \tag{4.40}$$


As with the piecewise sinusoid, the basis function consists of two parts, with two
associated (and unknown) coefficients $I_n$ and $I_{n+1}$, and again, it is often convenient
to reinterpret the function as spanning two segments, with one associated
coefficient $I_n$. This idea is very useful at wire junctions.
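
A short sketch of how a current is assembled from functions of the form of Eq. (4.40) may help. The Python below is illustrative only: the function name, node positions and coefficient values are assumptions, and real coefficients are used purely to keep the example simple.

```python
import numpy as np


def piecewise_linear_current(z, nodes, coeffs):
    """Sum the two-part functions of Eq. (4.40) over all segments."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    total = np.zeros_like(z)
    last = len(nodes) - 2
    for n in range(len(nodes) - 1):
        z_n, z_np1 = nodes[n], nodes[n + 1]
        # Half-open intervals avoid counting a shared node twice;
        # the final segment also includes its right-hand end.
        if n == last:
            inside = (z >= z_n) & (z <= z_np1)
        else:
            inside = (z >= z_n) & (z < z_np1)
        h_n = (coeffs[n] * (z_np1 - z) + coeffs[n + 1] * (z - z_n)) / (z_np1 - z_n)
        total += np.where(inside, h_n, 0.0)
    return total


# Hypothetical five-segment wire with zero current at both ends.
nodes = np.linspace(-0.25, 0.25, 6)
coeffs = np.array([0.0, 0.6, 1.0, 1.0, 0.6, 0.0])
z = np.linspace(-0.25, 0.25, 11)
current = piecewise_linear_current(z, nodes, coeffs)
print(current)
```

Summing the two-part functions segment by segment gives the same result as one triangle per interior node weighted by a single coefficient, which is the reinterpretation described above; it is also ordinary linear interpolation between the nodal values.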


4.4.4 Junction treatments with piecewise linear basis functions 


The NEC junction treatment is sophisticated, but a simpler approach first intro- 
duced by Chao and Strait in 1970 is worth mentioning, since it is still quite widely 
used. The only place the proof appears to have been published is a report for a 
government research laboratory [14, pp. 22—25] and given that these are frequently 
rather difficult to obtain, even when unlimited distribution was approved as was the 
case here, it is worth briefly deriving their approach. A description of the method 
(without proof) appears in [15, Chapter 4]. Chao and Strait used a slightly more 
complex variant of the piecewise linear function, with an interior node in each 
segment to permit better approximation of curved wires; here, we use straight seg- 
ments. 

Consider a three-wire junction at node n, as shown in Fig. 4.5. (The method 
works for any number of wires, but this keeps things simple. The general case 
is outlined at the end of the discussion.) Firstly, we introduce a “half-triangle” 
function of the form 


$$h_n(s) = \begin{cases} \dfrac{I_n (s - s_{n-1})}{\Delta} & 0 \le s - s_{n-1} \le \Delta \\ 0 & \text{otherwise} \end{cases} \tag{4.41}$$


Figure 4.5 A three-wire junction.


This is simply half the basis function defined in Eq. (4.40), but with z replaced 
by s, a local distance parameter along each wire. In this discussion, it is conve- 
nient if s = 0 corresponds to the end of each wire away from the junction, with 
s increasing as one approaches the junction. There are three currents to consider: 
the current on wire 1, just before the junction; and the same for wires 2 and 3. 
Note that at these points, the only basis functions contributing to the current are 
these half-triangle functions. We will call the corresponding coefficients $I_1$, $I_2$ and
$I_3$. (We will not include the node n in the notation since it is unnecessary here.)
With only two wires, it is sufficient to set $I_1 = -I_2$ and Kirchhoff’s current law is
automatically satisfied. (The negative sign is due to the convention on s adopted
above.) With three wires, one possibility is to allow the MoM procedure to include
$I_1$, $I_2$ and $I_3$, and then impose the additional constraint


$$I_1 + I_2 + I_3 = 0 \tag{4.42}$$


However, this often results in a constraint equation with very different magnitudes 
to the usual impedance matrix elements. 

The approach suggested by Chao and Strait is to consider each half-triangle 
coefficient as the sum of two components, hence: 


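$$I_1 = I_1' + I_1'', \qquad I_2 = I_2' + I_2'', \qquad I_3 = I_3' + I_3'' \tag{4.43}$$

Further, they propose that

$$I_1'' = -I_3', \qquad I_2'' = -I_1', \qquad I_3'' = -I_2' \tag{4.44}$$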


Figure 4.6 The three-wire junction, with the wires overlapped.

Figure 4.7 The final junction treatment, with two overlapped wires, and one not.


What this implies is that these basis functions are simply the usual piecewise
linear basis functions, spanning both the last segment of the relevant wire and
overlapping by one segment onto the next wire: that is wire 1 overlaps onto wire 2,
wire 2 onto wire 3, and wire 3 onto wire 1. This is shown in Fig. 4.6. (Our previous
comment regarding sign convention applies here too.) Substituting Eq. (4.44) into
Eq. (4.43),


$$I_1 = I_1' - I_3', \qquad I_2 = I_2' - I_1', \qquad I_3 = I_3' - I_2' \tag{4.45}$$


one notes that this choice identically satisfies Eq. (4.42) for any values of $I_1'$, $I_2'$
and $I_3'$. A unique solution is obtained by arbitrarily choosing one of the degrees of
freedom; it is convenient to set $I_3' = 0$. This yields


$$I_1 = I_1', \qquad I_2 = I_2' - I_1', \qquad I_3 = -I_2' \tag{4.46}$$


A little thought shows that this implies that we overlap wire 1 onto wire 2, wire 2 
onto wire 3, but do not overlap wire 3 onto wire 1, as in Fig. 4.7. For a gen- 
eral N wire junction, the procedure is to overlap wire n onto wire n + 1, but not 
wire N onto wire 1. Each of these overlapped wires is then treated with the usual 
MoM procedure as an open wire, with zero current at the end, as is the one
non-overlapped wire.

This is a somewhat cruder approximation than in NEC-2, since it satisfies only 
Kirchhoff’s current law, and not the continuity of scalar potential. However, for
junctions involving wires of the same or similar radius it works satisfactorily. It also
incorporates an element of arbitrariness, since which wire is not to be overlapped 
can be chosen at will. Finally, note that this procedure also works with piecewise 
sinusoidal basis functions. 
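
The overlap bookkeeping for a general N-wire junction can be sketched in a few lines of Python. This is only an illustration of the relations derived above, with the coefficient of the last (non-overlapped) wire set to zero; the function name and the sample coefficient values are assumptions.

```python
def junction_currents(primed):
    """Half-triangle coefficients I_1..I_N from overlapped coefficients I'_1..I'_(N-1).

    Wire n overlaps onto wire n + 1; wire N is not overlapped onto wire 1,
    which corresponds to setting I'_N = 0.
    """
    ext = list(primed) + [0.0]  # append I'_N = 0
    n_wires = len(ext)
    # I_m = I'_m - I'_(m-1), with I'_0 taken as I'_N (index -1 wraps around).
    return [ext[m] - ext[m - 1] for m in range(n_wires)]


# Hypothetical three-wire junction with complex coefficients.
currents = junction_currents([1.0 + 2.0j, -0.5j])
print(currents)       # I_1 = I'_1, I_2 = I'_2 - I'_1, I_3 = -I'_2
print(sum(currents))  # Kirchhoff's current law: the sum telescopes to zero
```

Whatever values the MoM solution returns for the overlapped coefficients, the sum of the junction currents telescopes to zero, so no separate constraint equation is needed.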


4.5 The method of weighted residuals 


Even at an introductory level, one cannot leave the subject of the method of mo- 
ments without introducing a very important extension. It was commented that the 
point-matched procedure which was used only enforced the boundary condition at 
the sample points. A method generally known in the applied mathematics literature 
as the method of weighted residuals provides a systematic method for improving 
this. Before we do this, some notation needs to be introduced first. We return to 
Eq. (4.2), repeated here for convenience: 


$$V(z, \rho = a) = \frac{1}{4\pi\epsilon_0} \int_0^L \frac{\rho(z')}{R(z, z')} \, dz' \tag{4.47}$$

and introduce linear operator notation

$$\mathcal{L} f = g \tag{4.48}$$


where $\mathcal{L}$ is the operator which maps function f to function g. In the case of
Eq. (4.47), for instance, the function f is the charge $\rho$; the function g is the voltage
on the wire; and the linear operator $\mathcal{L}$ is


$$\mathcal{L} = \frac{1}{4\pi\epsilon_0} \int_0^L \frac{(\cdot)}{R(z, z')} \, dz' \tag{4.49}$$


The bracketed dot is used as a place-holder for the function on which this operator 
acts. Using this notation, the previous development then produces 


$$\mathcal{L} \sum_{n=1}^{N} a_n h_n \approx g \tag{4.50}$$


where, as before, f has been approximated using the basis functions, viz.


$$f \approx \sum_{n=1}^{N} a_n h_n$$

Using point-matching, the N × N linear system can be obtained by testing the
above at N test points. But now, instead of doing this, we form the residual as:


$$R = \mathcal{L} \sum_{n=1}^{N} a_n h_n - g \tag{4.51}$$


This residual is the difference between the operator acting on the approximate
solution and the known function g; it vanishes everywhere only if the approximation
is exact. (At the risk of belaboring the obvious, if this was one of the very rare
problems which can be solved exactly using the MoM procedure, then the residual 
would be zero.) The point-matching procedure forces this residual to zero at N 
discrete points. A better approach would be to try to obtain some type of average 
value of the residual over the domain of the problem (the length of the wire in this 
case), and set this to zero. One can do this in a quite general fashion by introducing 
the idea of a weighting function, which is multiplied by the residual (and hence the 
name, method of weighted residuals) and integrated over the domain. The weight- 
ing function (also often known as a testing function) is also usually expressed as 
some type of finite series: 


$$w = \sum_{m=1}^{M} w_m \tag{4.52}$$


In this case, the equality is appropriate, since we are not approximating this func- 
tion. Note also that there are no unknown coefficients. Symbolically, the weighted 
residual method becomes 


$$\int_L R \sum_{m=1}^{M} w_m \, dz = \int_L \sum_{m=1}^{M} w_m \, \mathcal{L} \sum_{n=1}^{N} a_n h_n \, dz - \int_L \sum_{m=1}^{M} w_m \, g \, dz = 0 \tag{4.53}$$


Usually, the number of basis functions (N) and the number of weighting functions
(M) are equal. Because this integration process frequently defines an inner
product, an equivalent notation frequently encountered is


$$\left\langle w_m, \mathcal{L} \sum_{n=1}^{N} a_n h_n \right\rangle = \langle w_m, g \rangle \tag{4.54}$$


This is of course the bracket notation widely used in quantum mechanics, for the 
matrix algebra formulation of Heisenberg. We will not pursue this further, other 
than to note that the reason for this analogy is that both classical electromagnetics 
and quantum mechanics are at heart field theories. 

It is easy to show that the method of weighted residuals produces a matrix equa- 
tion, of the same form as Eq. (4.11), repeated here: 


$$\{V\} = [Z]\{I\} \tag{4.55}$$

except that the matrix entries are now

$$Z_{mn} = \langle w_m, \mathcal{L} h_n \rangle, \qquad V_m = \langle w_m, g \rangle, \qquad I_n = a_n \tag{4.56}$$


In addition to the question of which type of basis functions to adopt, one now can 
also choose a variety of weighting functions. This matter has been quite extensively 
researched. In practice, however, there are two very popular choices. The Galerkin 
procedure uses the same basis and weighting functions. The collocation method, 
which we have already studied, uses Dirac delta functions, which of course reduce 
to just testing the operator at the sample points. 
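
The two choices can be compared on the electrostatic wire problem of Eq. (4.47). The Python sketch below is a hedged illustration rather than the book's own code: it assumes pulse (piecewise constant) basis functions, a wire held at 1 V, the thin-wire kernel $R = \sqrt{(z - z')^2 + a^2}$, and Gauss-Legendre quadrature for the outer (testing) integral in the Galerkin case. The function names and the wire dimensions are illustrative.

```python
import numpy as np

EPS0 = 8.854187817e-12  # permittivity of free space, F/m


def inner_integral(z, z_lo, z_hi, a):
    """Closed form of the integral of dz'/sqrt((z - z')^2 + a^2) over one source segment."""
    return np.arcsinh((z_hi - z) / a) - np.arcsinh((z_lo - z) / a)


def solve_charged_wire(n_seg, length=1.0, radius=1e-3, weighting="collocation", n_gauss=16):
    """Return (charge densities a_n, total charge) for a wire held at 1 V."""
    edges = np.linspace(0.0, length, n_seg + 1)
    z_lo, z_hi = edges[:-1], edges[1:]
    z_mid = 0.5 * (z_lo + z_hi)
    delta = length / n_seg
    if weighting == "collocation":
        # Dirac delta weighting: test the potential at each segment midpoint.
        Z = inner_integral(z_mid[:, None], z_lo[None, :], z_hi[None, :], radius)
        V = np.ones(n_seg)
    elif weighting == "galerkin":
        # Weighting functions equal the (real) pulse basis functions:
        # integrate the potential over each test segment by quadrature.
        x, w = np.polynomial.legendre.leggauss(n_gauss)
        Z = np.zeros((n_seg, n_seg))
        for xq, wq in zip(x, w):
            z_q = z_mid + 0.5 * delta * xq
            Z += 0.5 * delta * wq * inner_integral(
                z_q[:, None], z_lo[None, :], z_hi[None, :], radius)
        V = np.full(n_seg, delta)
    else:
        raise ValueError("weighting must be 'collocation' or 'galerkin'")
    Z /= 4.0 * np.pi * EPS0
    a_n = np.linalg.solve(Z, V)
    return a_n, a_n.sum() * delta


_, q_col = solve_charged_wire(20, weighting="collocation")
_, q_gal = solve_charged_wire(20, weighting="galerkin")
print(f"total charge, collocation: {q_col:.3e} C")
print(f"total charge, Galerkin:    {q_gal:.3e} C")
```

Both weightings produce a system of the same form as Eq. (4.55); only the way each row is tested changes, and for this simple problem the total charge from the two differs only slightly.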

Before concluding this section, one or two points which can (and have) caused 
confusion in the past should be highlighted. Firstly, the inner product implied 
above for two functions f and g defined on domain $\mathcal{D}$ is:


$$\langle f, g \rangle = \int_{\mathcal{D}} f g \, dV \tag{4.57}$$


For real valued functions, the operation thus defined satisfies the mathematical re- 
quirements of an inner product. However, for complex valued functions, it defines 
a symmetric rather than inner product, and in this case, the Galerkin procedure re- 
quires weighting functions which are the complex conjugate of the basis functions. 
(The symmetric product defines a quantity known as reaction in electromagnetic 
theory [7, Section 10.7; 4, Section 7.6].) A valid inner product for complex-valued 
functions is: 


$$\langle f, g \rangle = \int_{\mathcal{D}} f g^* \, dV \tag{4.58}$$


where $g^*$ is the complex conjugate of g. In this case, the basis and weighting
functions are identical in the Galerkin procedure. Heated debates have arisen over this
in the literature; mathematically, it is important, because functions and operators 
defined within the framework of a proper inner product (and also with some ad- 
ditional properties) are known to be elements of Hilbert and/or Sobolev spaces, 
which confer various properties, important with regard to error analysis and con- 
vergence studies, on the problem. In practical engineering applications, the differ- 
ence is usually unimportant. 
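
The distinction between Eqs. (4.57) and (4.58) is easy to demonstrate numerically. The sketch below is an illustration under simple assumptions: a one-dimensional domain [0, 1), a uniform Riemann sum for the integral, and a test function chosen only because its symmetric product with itself vanishes.

```python
import numpy as np


def symmetric_product(f, g, dx):
    """Discrete form of Eq. (4.57): no complex conjugate."""
    return np.sum(f * g) * dx


def inner_product(f, g, dx):
    """Discrete form of Eq. (4.58): conjugate on the second function."""
    return np.sum(f * np.conj(g)) * dx


x = np.linspace(0.0, 1.0, 1000, endpoint=False)
dx = x[1] - x[0]
f = np.exp(1j * 2.0 * np.pi * x)  # hypothetical complex-valued test function

sym = symmetric_product(f, f, dx)
inn = inner_product(f, f, dx)
print(f"symmetric product <f, f>: {abs(sym):.2e} (magnitude)")
print(f"inner product     <f, f>: {inn.real:.6f}")
```

The symmetric product of this non-zero function with itself is zero, so it cannot serve as a norm, whereas the conjugated form returns a positive real number, as an inner product must.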

On a different topic, the use of Dirac delta functions to derive the collocation ap- 
proach from the method of weighted residuals has been criticized by some writers 
[16]. The core of this criticism is the observation that functions such as these are 
only properly defined in a distributional sense (i.e. under an integral sign). Again, 
whilst valid from a theoretical viewpoint, in practice the collocation method stands 
on its own merits, does not need to be derived thus, and is often a very effective
formulation. 

One final point we can now explain — the origin of the name “method of mo- 
ments.” Again, consider a one-dimensional problem, such as the electrostatic one 
we started the chapter with. If we use a method of weighted residuals approach, 
but select as weighting functions the set $\{z, z^2, z^3, \ldots\}$, we form the moments of
the residual. In applied mathematics, the method of moments is this specific form 
of the method of weighted residuals. Harrington chose it as the generic name for 
method of weighted residuals approaches in electromagnetics, and the name stuck. 
(In [17], he explained that when first working with the method, he tried to avoid in- 
troducing new jargon, and that the name method of moments had previously been 
used by the Russian mathematicians Kantorovich and Akilov.) Arguably, it may 
not have been the best choice of name, but four decades of usage in computational 
electromagnetics have established it so firmly as to be beyond debate. One will also 
sometimes find the term boundary element method used instead of MoM; usually, 
these terms are identical, although we caution that volumetric MoM formulations 
are available which are not boundary, but rather volume, element methods. (We 
briefly discuss volume elements in Chapter 6.) 


4.6 Further reading 


Although elegant theoretically, the MoM is probably the most difficult formulation 
of those presented in this book to implement accurately and efficiently. In the next 
chapter, we will turn our attention to the use of commercial codes, and not attempt 
to develop the simple codes presented in this chapter further. For those intending 
to develop codes themselves, the MoM is surprisingly badly served by textbooks 
for applications significantly more advanced than the introductory level treatment 
presented here, and the following notes may be of use. 

Firstly, one still needs to refer to some of the original papers on the topic — 
there is no MoM equivalent of the books by Silvester and Ferrari or Jin on the 
FEM [16, 18] or Taflove on the FDTD [19]. In this context, the original paper by 
Pocklington [20] is both still available in specialized libraries, and still interesting 
reading, although it will be of little help in developing an MoM code. 


An historical aside — H. C. Pocklington 


Reading scientific papers from this age can be a little humbling for modern re- 
searchers. At the same meeting of the Cambridge Philosophical Society where 
Pocklington presented his work (25 October 1897), a paper by C. T. R. Wilson 
on his cloud chamber was presented. At other meetings of that year, numerous 
papers appear by J. J. Thomson. 1897 was of course the year that Thomson
announced the discovery of the electron at the 30 April, 1897 meeting of the 
Royal Institution — although he called it a corpuscle at that time. Pocklington 
was a fellow of St. John’s College, Cambridge, and during a sabbatical visit to 
Cambridge the present author tried to obtain more details about his life. Sadly, 
no photograph or any other information about him was available, unlike Thom- 
son, who went on to become Master of Trinity College, Cambridge, one of the 
most prestigious positions at that University, as well as of course winning the 
Nobel prize. Both Trinity and St. John’s have a proud tradition of scientific ac- 
complishment, Trinity numbering Newton and Maxwell amongst its fellows in 
addition to Thomson, and St. John’s Dirac. 


The collection of reprints edited by Miller et al. is very useful in this context, 
over a decade after publication [6]. It contains a number of seminal papers, many 
of which have been referenced in this chapter, as well as a translation from the 
original German of an important basic theoretical paper by Maue [21], dating back 
to 1949, which derived what have become known as the electric/magnetic field 
integral equations, discussed in Chapter 6. The original text by Harrington [22], 
although reprinted on several occasions and still very widely referenced, is not 
particularly useful when implementing complex RF simulation codes since its fo- 
cus is more on basic concepts. However, several important chapters in the now 
hard to find [23], such as [24], are of considerable interest when implementing 
complex wire codes, and this still appears to be the only comprehensive derivation 
available of the magnetic field integral equation as generally used; this work gen- 
eralized some aspects of Maue’s original derivation. Another hard to find reference 
with useful information on MoM procedures for arbitrarily oriented wire antennas 
is [25]. In this context, Moore and Pizer’s monograph [15] was useful in its time, 
but unfortunately has never been revised and may be difficult to locate. Finally, 
another useful source on this topic, which should be far easier to obtain, is the the- 
ory manual for NEC-2 [13]. Good introductory treatments of the MoM for antenna 
applications are available in [4, 7, 26], which provide a somewhat more extended 
coverage of the subject than in this chapter; however, these are by no means fully 
comprehensive treatments. The only extended text on the MoM is Wang’s [27], and 
the book has some material which has dated quickly, specifically in the context of 
a controversy then raging in the literature about iterative methods. Peterson et al.’s 
book [28] has a good theoretical treatment of canonical problems, but as with the 
introductory MoM treatment in the antenna textbooks mentioned above (and also 
Wang’s volume), it does not deal with the complexities of arbitrarily oriented wire 
antennas, providing only a brief overview of the topic. 

Finally, the question of the convergence of the MoM has proven far from trivial;
a brief discussion may be found in Appendix C. 


4.7 Conclusions 


Although highly simplified, the theory discussed in this first chapter on the MoM 
is at the core of very complex and powerful MoM programs such as NEC-2 and 
FEKO. The former uses collocation, with a variant of the sinusoidal basis func- 
tion as discussed; the latter uses a Galerkin formulation with piecewise linear ba- 
sis functions, also as discussed. Extensions to arbitrarily oriented wire antennas 
rapidly become complex, due to the presence of different components of the elec- 
tric field (set up by the arbitrarily oriented currents) which need to be taken into 
account. Highly (as opposed to perfectly) conducting metallic structures can also 
be addressed with very similar theory. NEC-2 was one of the first codes to incor- 
porate a large number of such facilities; modern commercial codes such as FEKO 
incorporate all these, as well as many other powerful analysis capabilities. 

In the next chapter, we will look specifically at the use of FEKO and NEC-2 
for wire antenna modelling. Following this, we return to more theoretical topics, 
considering modelling highly conducting surfaces in Chapter 6, as well as hybrid 
formulations to reduce the computational cost of this, and we conclude our study of 
the MoM in Chapters 7 and 8 with a discussion of Green functions, stratified media 
formulations, and the Sommerfeld potentials. In Chapter 10, we will introduce a 
very powerful hybrid of the MoM with the finite element method, which permits a 
very efficient solution of certain classes of problems. 

5


The application of the FEKO and NEC-2 codes 
to thin-wire antenna modelling 


5.1 Introductory comments 


With the theoretical background now established, one is in a position to start using 
commercial and public domain MoM programs intelligently. In this chapter, we 
will discuss primarily the application of the commercial code FEKO for antenna 
modelling, but will also discuss the use of the public domain code NEC-2¹ in this
regard. Other than FEKO, few commercial programs (other than some proprietary 
NEC-2 extensions) provide good support for modelling thin-wire antennas, the 
topic of this chapter; such antennas are still very widely used indeed. For com- 
mercial programs, material is usually available to assist novice users to get started 
with the codes.² Hence we will not describe the basic concepts of entering the
geometry of the problem, including the source, and specifying parameters such as 
operating frequency and radiation patterns, since these vary from program to pro- 
gram, indeed quite often from release to release, and are usually quite well docu- 
mented by the suppliers. However, in the case of NEC-2, some comments are in 
order. 

NEC-2 is a “card driven” program, dating back to the days of “decks” of 
punched cards. A NEC model is described by a geometry file, usually with a .nec
extension. An example is given in Fig. 5.1. If using NEC in this form, one must
obtain a copy³ of the usual manual [1]. Each line in this file describes either a
geometrical element or an analysis operation; the first two lines are simply comments;
the third line GW is a straight wire, with a tag of 1 in this case (a tag is a number 
referring to the particular wire, and is used to simplify later references), divided 
into 41 segments, with (x, y, z) coordinates of the first end (0, 0, —0.25), of the 


¹ Again, as in the previous chapter, we will use NEC-2 and NEC interchangeably in this chapter. All the
comments made are equally applicable to NEC-4.

² In the case of FEKO, a “Getting Started” manual is provided.

³ This has been made available on the Internet. See Appendix F for a list of websites which can assist in this
regard.

CM Dipole Example
CE Start of geometry
GW 1 41 0.000000 0.000000 -0.250000 0.000000 0.000000 0.250000 0.00500
GE 0
FR 0 51 0 0 250.00000 2.0000000
EX 0 1 21 00 1.00000 0.00000
XQ 0
EN

Figure 5.1 A sample NEC input file.


second end (0, 0, 0.25) and radius 0.005. All dimensions are in meters by default. 
The fourth line GE indicates that the geometry section has ended. The fifth line 
FR specifies the frequency; the sixth, EX, specifies a voltage-source excitation on 
the 21st segment of the wire with tag 1; and the penultimate line, XQ, executes the 
program, computing input impedances and (possibly) radiation patterns. The final 
line EN ends the “deck.” 


Code tip — using NEC-2 


NEC-2 is only the computational engine, originally written in one of the ear- 
lier versions of FORTRAN, which performs the MoM computations as specified 
in the input file, and writes data to an output file. No graphical support is pro- 
vided at all. An entire industry grew up providing such support; some packages 
are fully featured commercial products with major additional computational fea- 
tures, such as SuperNEC; others, such as Wiregrid for Windows, are freeware, 
providing only graphical user interface (GUI) support. Wiregrid does have one 
feature worth highlighting; it is able to generate wire-mesh approximations of 
surfaces, using an algorithm published in [2]; no other NEC-2 GUI appears to 
support this at the time of writing. (Generating such a mesh by hand is an in- 
credibly tedious operation.) 

Although not clear from Fig. 5.1, the column spacing can be crucial — i.e. 
the x coordinate of end 1 must be entered between columns 11 and 20 for some 
versions of NEC-2. There are many slightly different versions of the code, com- 
piled by different authors, and the earlier versions had limited parsing ability on 
data files. Later versions relaxed this, and also permitted the use of commas to 
demarcate data fields. One is well advised to get one of the many GUI inter- 
faces mentioned above, since otherwise preparing a NEC-2 data file can be very 
frustrating indeed. 

An advantage of the NEC-2 open-source mode of operation is that it lends 
itself to use in a variety of applications — optimization, for instance — since it 
is relatively easy to generate NEC-2 input files automatically, and using tools 
such as grep, the output file can be parsed for the required output parameters. 
However, this is not an operation recommended for beginners. In some cases the
code has even been partially or entirely rewritten in other languages; part of
the present author’s doctoral dissertation was an implementation in a language 
called Occam, to permit efficient parallelization of the code [3]. 
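
As a hedged illustration of generating input files automatically, the Python sketch below writes a dipole deck with the same cards as described for Fig. 5.1. It assumes a NEC-2 build that accepts free-format (whitespace separated) fields; as noted above, some versions require fixed columns, in which case the formatting would need to change. The function name and its parameters are hypothetical, and the integer fields simply follow the sample deck.

```python
def dipole_deck(n_segments, length=0.5, radius=0.005,
                f_start_mhz=250.0, f_step_mhz=2.0, n_freq=51):
    """Return the text of a NEC-style input deck for a centre-fed dipole on the z axis."""
    if n_segments % 2 == 0:
        raise ValueError("use an odd number of segments so the source sits at the centre")
    feed = n_segments // 2 + 1  # centre segment, counting from 1
    half = length / 2.0
    lines = [
        "CM Dipole Example",
        "CE Start of geometry",
        # GW: tag, segments, (x, y, z) of end 1, (x, y, z) of end 2, radius
        f"GW 1 {n_segments} 0.0 0.0 {-half:.6f} 0.0 0.0 {half:.6f} {radius:.5f}",
        "GE 0",
        # FR: number of frequency steps, start frequency and increment in MHz
        f"FR 0 {n_freq} 0 0 {f_start_mhz:.5f} {f_step_mhz:.5f}",
        # EX: voltage source on the centre segment of the wire with tag 1
        f"EX 0 1 {feed} 0 1.0 0.0",
        "XQ",
        "EN",
    ]
    return "\n".join(lines) + "\n"


# One deck per discretization, e.g. as the first step of a convergence study.
for n in (5, 11, 21, 41):
    deck = dipole_deck(n)
    print(f"--- {n} segments ---")
    print(deck)
```

Writing each deck to its own file and running NEC-2 on it is left out here, since the executable name and output layout differ between the many versions of the code.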


FEKO was also influenced by NEC; at the time of writing FEKO still referred 
to “cards” in the input file. The actual input file used by FEKO has a .fek
extension, and consists of lines of data, usually preceded by a two-letter label. (It
is either in ASCII or binary format; the former is advantageous when generating
geometry files on a PC for running on a more powerful computer such as a
workstation or even supercomputer.) However, this is a very difficult format for users to
comprehend. At the time of writing, FEKO was usually run from a PREFEKO file
(with extension .pre). PREFEKO is a type of scripting language which generates
the .fek file from elementary geometrical and other primitives.⁴


The code FEKO 


This code had its genesis in the doctoral work of Jakobus at the University of 
Stuttgart in Germany during the early 1990s. It is an acronym of the German 
name: “FEldberechnung bei Körpern beliebiger Oberfläche”, which translates
as field computations involving bodies of arbitrary shape. It incorporates a pow- 
erful MoM treatment using piecewise linear triangular functions for metallic 
structures — both wires and surfaces. It also supports the MoM treatment of di- 
electric structures, using either surface or volumetric treatments. A unique fea- 
ture of FEKO is the approximate hybrid treatment available using physical op- 
tics. We will discuss many of these topics in Chapter 6. FEKO is available across 
a wide range of platforms, including supercomputers. The code ships with a 
very usable GUI (although this was being redesigned at the time of writing). Re- 
cent additions have included the Sommerfeld treatment for stratified media (the 
topic of Chapters 7 and 8) and the fast multipole method (see Chapter 6). The 
code is very popular in Europe, and is starting to penetrate the USA and Asian 
markets at the time of writing. A version with restricted capabilities (some- 
times called FEKO Lite) is available at no cost. See Appendix F for contact 
details. 


⁴ Many CEM codes have an entirely graphical geometry-entry process; although attractive for first-time users
this does not offer the same fine control as the current FEKO approach, but this may well change in future 
releases of the code. FEKO does of course provide visual feedback of the geometry which has been created as 
soon as PREFEKO is run. Some, such as MWS, discussed in Chapter 3, offer both. 

Historical note — other thin-wire codes


MININEC is another program which one quite frequently sees mentioned. The 
name is slightly misleading, since it implies that it is a stripped-down version of 
NEC-2 — this was indeed the original intent of authors Rockway and Logan when 
the project was first mooted in 1980. However, it evolved into an entirely separate 
implementation, using a different formulation, and different basis functions (in 
the current version, triangular ones). See Appendix F for contact details. 

Wire (also known as Thin Wire) was a program originally developed by Rich- 
mond at Ohio State University; it still has a loyal following there and versions 
have been made publicly available. See [4, Appendix F] for more details of the 
code in its 1989 incarnation, WIRE89, with a FORTRAN listing. 


5.2 An introductory example: the dipole 


No matter what numerical technique has been used - MoM, FDTD, FEM - one of 
the first things to check is that the solution is indeed converged. What we mean by 
this is that, after a certain point, refining the mesh (making segment size smaller, 
for a simple MoM problem) does not change the solution. (In Chapter 3,
Section 3.2.7, we saw how making Δ smaller improved the quality of the solution by
comparison to the analytical result.) To investigate this we will study the
half-wavelength dipole. A note is in order here: this term can cause confusion for
newcomers in antenna engineering, since what is usually meant is the wavelength
at which the dipole exhibits its first resonance, i.e. has no reactive part of the
impedance. This is usually equivalent to the wavelength at which the reflection
coefficient is minimized in a typical 50 Ω or 75 Ω system, since the real part of
the input impedance is generally on the order of 50-70 Ω and changes far less
rapidly than the reactance at resonance. It generally occurs at a dipole length somewhere
between 0.46λ and 0.49λ, depending on the dipole thickness.
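
As a quick worked check, the resonant-length range quoted above can be turned into a frequency band for a given dipole. The sketch below assumes the range is 0.46 to 0.49 wavelengths and uses the free-space speed of light; the function name is illustrative.

```python
C0 = 299792458.0  # speed of light in vacuum, m/s


def resonance_band_mhz(length_m, lo=0.46, hi=0.49):
    """Frequencies (MHz) at which a dipole of the given length is lo and hi wavelengths long."""
    f_lo = lo * C0 / length_m / 1e6
    f_hi = hi * C0 / length_m / 1e6
    return f_lo, f_hi


f_lo, f_hi = resonance_band_mhz(0.5)  # the 0.5 m dipole studied in this section
print(f"expected first resonance between {f_lo:.1f} MHz and {f_hi:.1f} MHz")
```

For the 0.5 m dipole this gives a band of roughly 276 to 294 MHz, which is a useful sanity check on the frequency range chosen for a simulation.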


Modelling hints — convergence studies using FEKO 


In FEKO, there is unfortunately no simple way to undertake a convergence study 
by creating multiple structures in one file, and one needs to change the discretiza- 
tion manually in the PREFEKO file, run PREFEKO again, and also of course 
re-run FEKO.ᵃ There are various ways of proceeding from here, but probably
the easiest is to save the output file (.out) after each run with a distinctive name,
and then use the Import -Select File option in the FEKO graphical post-processor
to read the data in from each file.


ᵃ It is possible to do this with OptFEKO, the FEKO optimizer, but this is beyond the scope of the present
discussion. 

Figure 5.2 Results of convergence study for a dipole of length 0.5 m, radius 0.005 m. A1
feed model. 


The result of such a convergence study is shown in Fig. 5.2. The default reference
impedance of 50 Ω was used to create these plots.⁵ All produce a minimum
reflection coefficient of around −15 dB except for the coarsest mesh (−14 dB);
interpolating a little, the frequency of this varies from 292 MHz (5 segments) through
281 MHz (11 segments) and 278 MHz (21 segments) to 276 MHz (41 segments).
The five segment model has a segment size of just under λ/10, which is about the
largest segment length which should be used in thin-wire modelling, certainly near
a source. FEKO will issue a warning or an error if the segmentation is grossly
inadequate. NEC, however, does not; many of the preprocessors now available provide
this functionality, another reason that it is strongly recommended to use one!

The obvious course is now to proceed with further refining of the mesh (81 seg- 
ments, etc.) but for subtle theoretical reasons, this is not wise. The problem is that 
the FEKO solution is based on the thin-wire approximation, discussed in Chap- 
ter 4. With a large number of segments, each segment becomes very short, and 
although the wire overall may indeed be thin, this is no longer true for a particular 
segment. FEKO issues a warning if the ratio of segment length to radius is less than 
around 3.3, and an error if this is less than 1. (The developers of NEC suggest an 
even more conservative ratio of around 8 as a preferred lower bound [1, p. 4].)
Indeed, our 41 segment model actually violated this, with a ratio of 2.5. If one opens
the output file and views the warnings, one will observe that a warning was indeed
issued with the 41 segment model. (FEKO computes the ratio as radius to segment
length, so the values reported in the file are the inverse of those in this discussion.)


⁵ FEKO offers the ability to load sources; this is not the same as setting the reference impedance Z0.
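
The two segmentation checks discussed above (segment length relative to the wavelength, and segment length relative to the wire radius) are easy to script before running a solver. The following Python sketch is illustrative only: the thresholds (λ/10, and ratios of 3.3 and 1) are those quoted in the text, while the function name `check_segmentation` and the evaluation frequency of 290 MHz are assumptions made here and are not part of FEKO or NEC.

```python
C0 = 299_792_458.0  # speed of light in vacuum, m/s


def check_segmentation(length_m, radius_m, num_segments, freq_hz):
    """Return (segment length in wavelengths, length/radius ratio, status)."""
    seg_len = length_m / num_segments
    wavelength = C0 / freq_hz
    seg_per_lambda = seg_len / wavelength  # segment length in wavelengths
    ratio = seg_len / radius_m             # segment length to wire radius
    if ratio < 1.0:
        status = "error: segment shorter than the wire radius"
    elif ratio < 3.3:
        status = "warning: thin-wire assumption doubtful"
    elif seg_per_lambda > 0.1:
        status = "warning: segment longer than lambda/10"
    else:
        status = "ok"
    return seg_per_lambda, ratio, status


if __name__ == "__main__":
    # Dipole of the text: length 0.5 m, radius 0.005 m.
    for n in (5, 11, 21, 41):
        frac, ratio, status = check_segmentation(0.5, 0.005, n, 290e6)
        print(f"{n:3d} segments: {frac:.3f} lambda, "
              f"length/radius = {ratio:5.2f}, {status}")
```

For the 41 segment model this gives a length-to-radius ratio of about 2.4, which falls in the warning range, in line with the warning reported in the output file.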

The difference in resonant frequencies between the 21 and 41 segmentation runs 
is under 1%. It is important to note that resonant frequencies predicted numerically 
are often in error, typically by some few percent; indeed, this is perhaps the least 
accurate physical parameter computed by the MoM (and other numerical meth- 
ods). This is especially true of thin-wire structures, but is generally true of resonant 
devices. To illustrate this further, we also show a result computed using NEC-2 in 
Fig. 5.2. NEC-2 predicts a center frequency of around 273 MHz using 41 seg- 
ments, as opposed to the 276 MHz of the corresponding FEKO computation, an 
error also on the order of around 1%. NEC-2 uses different basis functions and 
a collocation approach, whereas FEKO uses piecewise linear basis functions and 
the Galerkin formulation, so one cannot expect the NEC-2 and FEKO results to be 
identical. To improve this further, one will need a more sophisticated source model
for both codes,⁶ and one should be aware that this is about the level of accuracy
for this parameter which can be expected from standard thin-wire codes.

FEKO offers other methods for driving dipoles, and it is worth looking at them 
briefly. The A1 model essentially replaces a segment with a region of impressed
electric field. It is important to note that this is done within the code! 


Modelling hints — feed points for wire antennas 


Many new users of MoM codes — FEKO, NEC-2 etc. — try to create a dipole from 
two wires, with a gap in the middle for the feed. This is incorrect! The correct 
approach is to specify a feed on an existing segment. In the region of the feed, 
the current is of course displacement current, rather than conduction current; it 
is effectively the former which the MoM is approximating in the feed region, but 
it still needs a segment (even though it is fictitious) and its associated expansion 
function in order to do this. 


The other feed models for thin-wire structures offered by FEKO are the A2
and A3 models. The former uses a very thin gap between two nodes. The latter
models a coaxial feed; it is derived by considering the TEM fields in a coaxial cable
feeding a monopole against a very large ground plane, as discussed in Section 4.3.
In Fig. 5.3, the results obtained by applying these three different feed models to
this dipole are shown. Twenty-one segments were used for the A1 and A3 sources,
and 20 for the A2 source. (Because the A2 source models a feed at a node rather
than on a segment, the model requires an even number of segments for this case
in order to place the feed at the dipole center.) For the A3 source, an equivalent
inner and outer radius must be specified; usually, the former is chosen as the wire
radius, and hence the latter is 2.3 times this for a 50 Ω system. This was used
to produce the results shown in Fig. 5.3. For this example, excellent agreement
between the various feed models is observed, which is very gratifying. However,
for other problems, one or other model may be far easier to use, hence the provision
of different models.


Figure 5.3 Comparison of different sources using 20 or 21 segments: voltage gap on
segment (A1); voltage gap at node between segments (A2); magnetic frill feed (A3).


⁶ One such approach uses a quasi-static MoM model first to establish the incident field, which is usually assumed
in such MoM models, and then uses this in the full-wave solution. One also needs to treat end-caps carefully.
The best source on this is [5], whose results were also supported by careful measurements.
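
The factor of 2.3 quoted for the A3 (magnetic frill) source follows from the characteristic impedance of an air-filled coaxial line, Z0 = (η0/2π) ln(b/a), which is approximately 60 ln(b/a) Ω, where a and b are the inner and outer radii. A minimal Python sketch of this relation is given here; the function name `frill_outer_radius` is an assumption for illustration and does not correspond to any FEKO call.

```python
import math

ETA0 = 376.730313668  # impedance of free space, ohms


def frill_outer_radius(inner_radius_m, z0_ohm=50.0):
    """Outer radius b of an air-filled coax with inner radius a and impedance z0.

    Inverts Z0 = (ETA0 / (2*pi)) * ln(b / a).
    """
    return inner_radius_m * math.exp(2.0 * math.pi * z0_ohm / ETA0)


if __name__ == "__main__":
    a = 0.005  # wire radius of the dipole in the text, metres
    b = frill_outer_radius(a)
    print(f"b/a = {b / a:.2f}, b = {b * 1e3:.2f} mm")
```

For other reference impedances the ratio changes exponentially, so the outer radius should be recomputed rather than scaled by eye.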

A final comment on convergence testing. For complex models, in particular 
ones using geometrical data imported from other programs, checking convergence 
may be very difficult. This example gives some guidelines for the type of errors 
one should expect. Our coarsest mesh (Δ ≈ λ/10) produced an error of around
5.5% (with respect to the finest mesh). Refining the mesh to Δ ≈ λ/20 more than
halved the error to around 2%. Refining the mesh again to Δ ≈ λ/40 once again
halved the error. However, the actual values of the errors will vary from problem
to problem, and we caution that if it is not possible to use a quite fine mesh (i.e.
small segment size of Δ ≈ λ/20) one needs to be very careful indeed in accepting
results generated using any MoM program. In the examples to follow, we will
generally use quite fine meshes satisfying at least this criterion, and will not
explicitly remark again on convergence, but it should always be kept in mind.
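
The error figures above can be reproduced, approximately, from the resonant frequencies quoted earlier for Fig. 5.2. In the Python sketch below the frequencies are those given in the text; the helper name `relative_errors` is hypothetical. Since the frequencies were read off the plot with some interpolation, the percentages agree with those quoted only to within a few tenths of a percent.

```python
def relative_errors(freqs_by_segments):
    """Percentage deviation of each resonant frequency from the finest mesh."""
    finest = max(freqs_by_segments)      # largest segment count
    f_ref = freqs_by_segments[finest]
    return {n: 100.0 * abs(f - f_ref) / f_ref
            for n, f in freqs_by_segments.items() if n != finest}


# Resonant frequencies in MHz versus number of segments (values from the text).
FEKO_RESONANCES = {5: 292.0, 11: 281.0, 21: 278.0, 41: 276.0}

if __name__ == "__main__":
    for n, err in sorted(relative_errors(FEKO_RESONANCES).items()):
        print(f"{n:2d} segments: {err:.1f}% from the 41-segment result")
```

The same helper can be applied to any scalar quantity tracked over successive mesh refinements, for instance the input resistance at a fixed frequency.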


Code tips — structural versus control cards in NEC 


NEC differentiates between two different types of cards, namely structural and 
control cards. The former define actual metallic segments and patches, either via 
the direct creation of a wire or surface, or via operations on structural elements 
such as copying or reflection. The latter control parameters such as the location 
of the excitation, operating frequencies, grounds, near- and far-fields requested 
etc. 

Note that a NEC file requires at least one card which triggers execution, such 
as a field computation. The XQ card is a convenient way of forcing execution 
otherwise. 

FEKO also distinguishes cards in a similar fashion, using the terms geometry 
and control cards respectively. 


5.3 A wire antenna array: the Yagi-Uda antenna 


In the preceding section, we discussed how to specify feed models, as well as the 
importance of checking that the analysis has converged. However, the thin-wire 
half-wavelength dipole is not a very stimulating engineering design on its own. A 
much more interesting example is an array of dipoles. Two well-known examples 
here are the Yagi-Uda antenna⁷ and the log-periodic antenna, invented at the
University of Illinois Urbana-Champaign during the 1950s. Design tables are available
for both antennas, and some are reproduced in [6, 7]. The main difference is that 
the former is a narrowband, moderately high-gain structure, but with only one ele- 
ment (the driven element) fed; the latter is a wideband structure, somewhat lower 
in gain, with all the elements fed in parallel via a transmission line network. Both 
are very widely used for VHF and UHF communication, as well as TV reception 
from terrestrial broadcasts. (Satellite transmissions are in the microwave band and 
a high-gain dish is generally used.) As an example, we will analyze a simple
Yagi-Uda array, with one reflector, one driven element and four director elements. This
is illustrated in Fig. 5.4. 

We use the design data of Viezbicke, available in [6, Section 5.4] or in [7, Section
10.3.3]. Viezbicke’s design process usually consists of two stages: firstly,
establish the director and reflector lengths for the prototype Yagi [6, Table 5.4];
secondly, compensate for the actual wire radius using [6, Fig. 5-37]. By using
the wire diameter d = 2a = 0.0085λ of the prototype given in [6, Table 5.4], no
compensation is required. These tables do not give the length of the driven element;
this is usually the resonant dipole length in free space [6, p. 190]. (This
can be established from standard results, for instance [6, Table 5.2]: for L ≈ 0.5λ,
L/2a ≈ 59, the required shortening is about 5%, i.e. 0.475λ.) Hence our design
is as summarized in Table 5.1. Director 1 is closest to the driven element. Extracts
from the FEKO .pre file are given in Fig. 5.5; a NEC-2 data file is shown in
Fig. 5.6.


Table 5.1 Design data for a six-element Yagi array, wire radius
a = 0.00425λ, using Viezbicke’s results

Element      Length (in wavelengths)    Spacing (in wavelengths)
Reflector    0.482                      -0.2
Driven       0.475                      N/A
D1           0.428                      0.25
D2           0.420                      0.25
D3           0.420                      0.25
D4           0.428                      0.25

Spacing is relative to the previous element.


[Diagram: elements in the order Reflector, Driven, D1, D2, D3, D4, with spacings 0.2λ, 0.25λ, 0.25λ, 0.25λ, 0.25λ.]

Figure 5.4 The six-element Yagi array described in the text.


⁷ S. Uda is credited with the original design in 1926; the first English language publication was by his professor,
H. Yagi, in 1927 [6, p. 188].

#freq_o = 300.0e6            ** centre frequency in Hertz
#lam_o = #c0/#freq_o         ** wavelength in metre, #c0 = speed of light in vacuum
#rf_len = 0.482*#lam_o       ** Reflector
#dr_len = 0.475*#lam_o       ** driven element
#d1_len = 0.428*#lam_o
#d2_len = 0.420*#lam_o
#d3_len = 0.420*#lam_o
#d4_len = 0.428*#lam_o
#S_R = 0.2*#lam_o
#S_D = 0.25*#lam_o
#diam = 0.0085*#lam_o
#num_seg = 21
#delta = #dr_len/#num_seg
** Parameters for segmentation
IP                  #diam/2   #delta
** Geometry of radiating structure
DP rf_n             -#S_R     0         -#rf_len/2
DP rf_p             -#S_R     0         #rf_len/2
BL rf_n rf_p
DP dr_n             0         0         -#dr_len/2
DP dr_p             0         0         #dr_len/2
BL dr_n dr_p
DP d1_n             1*#S_D    0         -#d1_len/2
DP d1_p             1*#S_D    0         #d1_len/2
BL d1_n d1_p
DP d2_n             2*#S_D    0         -#d2_len/2
DP d2_p             2*#S_D    0         #d2_len/2
BL d2_n d2_p
DP d3_n             3*#S_D    0         -#d3_len/2
DP d3_p             3*#S_D    0         #d3_len/2
BL d3_n d3_p
DP d4_n             4*#S_D    0         -#d4_len/2
DP d4_p             4*#S_D    0         #d4_len/2
BL d4_n d4_p
** End of geometric input
EG 1 0 0 0 0

Figure 5.5 Part of a PREFEKO file for the six-element Yagi array illustrating the use of
user-defined variables and scaling.


CM 6 element Yagi
CE Start of geometry
GW1,21,-0.200000,0.000000,-0.241000,-0.200000,0.000000,0.241000,0.00425
GW2,21,0.000000,0.000000,-0.237500,0.000000,0.000000,0.237500,0.00425
GW3,19,0.250000,0.000000,-0.214000,0.250000,0.000000,0.214000,0.00425
GW4,19,0.500000,0.000000,-0.210000,0.500000,0.000000,0.210000,0.00425
GW5,19,0.750000,0.000000,-0.210000,0.750000,0.000000,0.210000,0.00425
GW6,19,1.000000,0.000000,-0.214000,1.000000,0.000000,0.214000,0.00425
GE 0
FR 0 51 0 0 275.00000 1.0000000
EX 0 2 11 00 1.00000 0.00000
XQ
EN

Figure 5.6 A NEC-2 file for the six-element Yagi array. This file uses the comma-delimited
format.


[Plot: reflection coefficient versus Freq [MHz], 275 to 325 MHz.]

Figure 5.7 Reflection coefficient of the six-element Yagi array.


[Plot: pattern in Gain [dBi].]

Figure 5.8 H-plane pattern of the six-element Yagi array at its resonant frequency.


Results for the reflection coefficient and the H-plane pattern at 291 MHz (the
actual resonant frequency) are given in Figs. 5.7 and 5.8 respectively. The simulation
indicates a -10 dB impedance bandwidth of around 5% (the range of frequencies
for which |S11| is less than -10 dB, corresponding to VSWR < 2), which is as
expected for a thin-wire structure. (These results were obtained for a segment length
of around λ0/40 at the center frequency.) The resonant frequency is 291 MHz,
some 3% lower than the design frequency. Since quite fine segmentation has been
used, this is probably a real effect, and were one to build this antenna, all the
dimensions should be scaled by a factor of 0.97 to obtain a resonant frequency
of 300 MHz. The peak directivity is just over 11 dBi (i.e. referred to an isotropic
radiator). Viezbicke’s tables indicated a gain of 10.2 dBd (referred to a half-wave
dipole), which is equivalent to 12.35 dBi. The reason for the difference is that
the directivity quoted here has been computed at the resonant frequency, whereas
the peak gain is achieved at around 305 MHz, and is indeed about 12.3 dBi. From
Fig. 5.8, the front-to-back ratio (the difference between the radiation in the forward
and rear directions) is around 10 dB; Viezbicke’s tables indicated around 19 dB,
but again, the comparison is at a different frequency. Note that gain and directivity
are not synonymous in antenna engineering, but since our antenna is lossless, we
can use the terms interchangeably here.
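
Two small conversions are used in the discussion above: from dBd to dBi, and from a simulated resonant frequency to a geometric scale factor. A short Python sketch follows; the function names are hypothetical, and the 2.15 dB offset is the usual figure for the directivity of a lossless half-wave dipole, which is what the 12.35 dBi value in the text implies.

```python
DIPOLE_GAIN_DBI = 2.15  # directivity of a lossless half-wave dipole, dBi


def dbd_to_dbi(gain_dbd):
    """Convert a gain referred to a half-wave dipole into gain over isotropic."""
    return gain_dbd + DIPOLE_GAIN_DBI


def scale_factor(f_simulated_hz, f_target_hz):
    """Factor by which all dimensions are multiplied to move the resonance.

    Resonant frequency is inversely proportional to size, so a structure
    that resonates too low must be shrunk by f_simulated / f_target.
    """
    return f_simulated_hz / f_target_hz


if __name__ == "__main__":
    print(f"10.2 dBd = {dbd_to_dbi(10.2):.2f} dBi")
    print(f"scale factor = {scale_factor(291e6, 300e6):.2f}")
```

The scaling applies to every dimension, including the wire radius; scaling the lengths alone changes the length-to-radius ratio and hence, slightly, the resonance.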

Also shown on Fig. 5.7 are the results of a NEC-2 simulation, run with a similar 
segmentation. The NEC-2 data file is shown in Fig. 5.6. The NEC-2 results show a 
yet lower resonant frequency of about 287 MHz, some 1.4% lower than the FEKO 
results. As we commented in the previous example, this is about as accurate a 
result as one can expect with two different MoM codes using relatively basic feed 
models. Interestingly, both simulations show another very narrow quasi-resonance 
just above the design frequency. 


Code tip — using Wiregrid for Windows 


This very useful NEC-2 preprocessor and postprocessor is available free, and
is very largely self-explanatory, with on-line documentation, but here are a few
tips on points which can otherwise cause frustration.

• The program has a function which permits one to see the actual wire radius visually.
  However, this deactivates most of the editing functions and needs to be switched off
  before proceeding further.

• There is an extremely useful function which forces an odd number of segments on all
  wires.

• Finally, after graphing etc., the NEC output file must be Released before one can
  edit the model again.


Figure 5.8 also shows the NEC radiation pattern predictions (the NEC results 
are computed at 287 MHz, the resonant frequency computed by NEC); we use 
these to illustrate an important point, namely the far-field radiation patterns are not 
as sensitive a parameter as the input impedance, and hence excellent agreement 
with other codes can usually be expected. (Agreement with measurements tends to 
be less satisfactory; frequently, the problem lies with the experimental setup, for 
instance problems with the feed cables interfering with the patterns.) 

We did not explicitly perform a convergence check, since we are using a fine 
discretization with around 40 segments per wavelength, but of course the com- 
ments in our introductory dipole section apply. Due to the relatively thick dipoles 
in use, one cannot refine the mesh further without starting to violate the thin-wire 
assumptions. 

Aside from the lower center frequency — which as we commented above, is 
easily fixed in practice (or indeed in simulation) by scaling — our six-element Yagi 
array works moderately satisfactorily. Now, we are in a position to evaluate quickly 
the effect of having to use a different wire radius etc., as is quite probable in an ac- 
tual design. This however might degrade the performance of the antenna. We might 
also not be satisfied with the front-to-back ratio, for instance, and wish to improve 
this. This leads into the field of optimization, which FEKO supports, although we 
will not pursue this further here. 


Modelling hints — using user-defined variables and scaling 


When developing a general-purpose model, it is often useful to specify dimensions
in terms of λ0, which makes it very easy to change the operating frequency.
Also, all the dimensions are given in terms of user-defined variables, so that if 
we want to change the design of the antenna (perhaps by optimization), we have 
already done a lot of the work. An example of this is shown in Fig. 5.5, which 
shows part of the PREFEKO file exploiting user-defined variables. Some other 
commercial codes, such as MWS, have similar abilities. Connected to this is 
scaling: a popular use of this is to permit microwave structures to be entered in 
millimeters. Whilst NEC-2 does support scaling, it does not support user-defined 
variables. 
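
Because NEC-2 has no user-defined variables, a short script can play the role of a parametric preprocessor. The Python sketch below regenerates the GW cards of Fig. 5.6 from the data of Table 5.1. It is an illustration under stated assumptions: the names `YAGI_ELEMENTS` and `yagi_gw_cards` are invented here, the segment counts are simply copied from Fig. 5.6, and the field order used (tag, segment count, end-point coordinates, radius) is that of the GW lines in that figure.

```python
# Element data from Table 5.1: (length, x position, segments),
# with lengths and positions in wavelengths. Segment counts as in Fig. 5.6.
YAGI_ELEMENTS = [
    (0.482, -0.20, 21),  # reflector
    (0.475, 0.00, 21),   # driven element
    (0.428, 0.25, 19),   # director 1
    (0.420, 0.50, 19),   # director 2
    (0.420, 0.75, 19),   # director 3
    (0.428, 1.00, 19),   # director 4
]


def yagi_gw_cards(elements, radius_wl, wavelength_m=1.0):
    """Build comma-delimited GW cards for parallel, z-directed wires."""
    cards = []
    for tag, (length_wl, x_wl, nseg) in enumerate(elements, start=1):
        x = x_wl * wavelength_m
        half = 0.5 * length_wl * wavelength_m
        radius = radius_wl * wavelength_m
        cards.append(
            f"GW{tag},{nseg},{x:.6f},{0.0:.6f},{-half:.6f},"
            f"{x:.6f},{0.0:.6f},{half:.6f},{radius:.5f}"
        )
    return cards


if __name__ == "__main__":
    # wavelength_m = 1.0 reproduces the dimensions used in Fig. 5.6.
    for card in yagi_gw_cards(YAGI_ELEMENTS, radius_wl=0.00425):
        print(card)
```

Changing `wavelength_m` rescales the whole antenna, which is the same effect that the variable #lam_o provides in the PREFEKO file of Fig. 5.5.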

Modelling hints — wire radius versus diameter


Here is an important point to note, which even experienced users forget from 
time to time: wire thicknesses in FEKO and NEC-2 are specified in terms of 
radius, whereas especially older texts in antenna design often use diameter. Ac- 
cidentally confusing these is a common source of error; to make things worse, 
the simulation will often still appear to work, but the results produced are usually 
subtly incorrect. 


5.4 A log-periodic antenna 


The Yagi-Uda example highlighted a number of points, but in a sense was simply 
an extension of the dipole problem, since the additional wires (the reflector and
the directors) were passive, and it was just a case of adding these into the .pre
file. The problem we will now investigate, however, brings some new points, with
regard to both FEKO modelling and antenna engineering. It also serves as an in- 
troduction to some ideas in wideband antennas. 

The log-periodic (log-p) antenna consists of a number of wire dipoles, but unlike 
the Yagi-Uda antenna, they are all fed (by means of a transmission line, which 
provides a parallel feed). Also, each element is smaller than and more closely 
spaced to its predecessor; the ratio is constant, and τ is the design parameter which
specifies this. With dipole lengths L_n and spacing d_n, this is defined as:

$$\tau = \frac{L_{n+1}}{L_n} = \frac{d_{n+1}}{d_n} \tag{5.1}$$

The other parameter which defines a log-periodic array is the spacing factor σ,
defined as

$$\sigma = \frac{d_n}{2 L_n} \tag{5.2}$$

One can also compute α, the angle of the wedge bounding the dipole arms of the
log-p, from these parameters:

$$\alpha = 2 \arctan\left(\frac{1 - \tau}{4\sigma}\right) \tag{5.3}$$

A value of τ close to 1 indicates a log-p with a very slow expansion, i.e. long
overall length, but also higher gain. The design of a log-p is typically a trade-off 
between length, gain and impedance match. Most design data are based on tables 
originally published by Carrel in 1961; subsequent research has improved these 
tables and a typical set are presented in [6, Section 6.7]. We will base our FEKO 
simulation on [6, Example 6.2].


Table 5.2 Design data for a nine-element log-periodic array

Element    Length (in meters)    Spacing to next element (in meters)
1          2.78                  0.828
2          2.29                  0.682
3          1.88                  0.560
4          1.54                  0.459
5          1.27                  0.378
6          1.04                  0.310
7          0.858                 0.256
8          0.705                 0.210
9          0.579                 -


Figure 5.9 The nine-element log-periodic array described in the text. The details of the
crossed feed are only shown for the largest three elements, but repeat to the end of the
array. Also shown is the feeding end, as well as the position for a possible terminating
load, as discussed in the text.


To summarize this briefly for readers without ready access to this reference, 
the design specification is for a 6.5 dB gain antenna over the VHF-TV and FM 
broadcast bands, which span the frequency range 54-216 MHz (a 4:1 bandwidth). 
From the design tables, τ = 0.822 and σ = 0.149 are selected to satisfy the gain
requirement. The lowest frequency determines the length of the longest element,
usually chosen as λmax/2, or 2.78 m in this case. Elements are then placed until an
element shorter than λmin/2 is produced. In this case, nine elements are required.
The tabulated data are for a dipole radius 1/250 of the dipole length, clearly varying
from element to element. The characteristic impedance of the transmission line is
100 Ω. The design is summarized in Table 5.2 and illustrated in Fig. 5.9.
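
The design procedure just summarized can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and the exact speed of light is used, so the longest element comes out as 2.776 m rather than the rounded 2.78 m of Table 5.2, and the later entries differ from the table in the third significant figure.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def design_log_periodic(f_min_hz, f_max_hz, tau, sigma):
    """Return (lengths, spacings) in metres, longest element first."""
    longest = 0.5 * C0 / f_min_hz    # lambda_max / 2
    stop_below = 0.5 * C0 / f_max_hz  # lambda_min / 2
    lengths, spacings = [longest], []
    # Add elements until one shorter than lambda_min / 2 has been produced.
    while lengths[-1] >= stop_below:
        spacings.append(2.0 * sigma * lengths[-1])  # d_n from L_n, eq. (5.2)
        lengths.append(tau * lengths[-1])           # L_{n+1}, eq. (5.1)
    return lengths, spacings


def wedge_angle_deg(tau, sigma):
    """Wedge angle alpha of eq. (5.3), in degrees."""
    return math.degrees(2.0 * math.atan((1.0 - tau) / (4.0 * sigma)))


if __name__ == "__main__":
    lengths, spacings = design_log_periodic(54e6, 216e6, tau=0.822, sigma=0.149)
    for n, length in enumerate(lengths, start=1):
        gap = f"{spacings[n - 1]:.3f}" if n <= len(spacings) else "-"
        print(f"{n}  L = {length:.3f} m  d = {gap}")
    print(f"alpha = {wedge_angle_deg(0.822, 0.149):.1f} degrees")
```

Note that each spacing is computed from the current element length before the length is reduced for the next element.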
To implement this in FEKO, there are several approaches that can be taken. The
first is simply to create nine wires. A better approach is to use the !!FOR ...
!!NEXT loop structure, as illustrated in Fig. 5.10. It will also be noted that we
construct the elements from four points: two at each end, but also two very close to 
the center. We do the latter for two reasons. Firstly, there is then always a segment 
at the center of the element to feed, no matter what the segment length. Secondly, 
we use the label (LA) card (the equivalent of a tag in NEC) to attach a unique 
label to these central segments; this makes connecting these fed segments (which 
represent the terminals of the elements) via a transmission line much easier. This 
is the next step to consider. 


Modelling hints — using iteration loops and conditional execution structures 
in PREFEKO 


Many antennas consist of repeated components, and PREFEKO has a very useful
feature to implement this, namely the !!FOR ... !!NEXT loop (iteration)
structure. This is illustrated in Fig. 5.10. We have used another useful feature
as well, namely the !!IF ... !!THEN ... !!ELSE conditional. Note
that d_n is computed from the current length, and is computed before we update
(reduce) the length for the next execution of the loop.

NEC-2 has no such functionality; the closest NEC-2 gets is the coordinate
transformation GM card, which allows one to copy, translate or rotate parts of the
geometry.


We also have to consider how to interconnect the radiating elements. The obvi- 
ous way is to connect wires to the elements to form a transmission line explicitly. 
However, this is not a very efficient way of handling the problem. Transmission 
lines are non-radiating structures, and can be succinctly described using two-port 
circuit theory. FEKO incorporates this feature, implemented using the TL card. 
(This functionality is also available within NEC-2, with the same name.) We need 
eight of these transmission lines; a subtle design point is that the transmission lines 
are crossed, i.e. reverse phase, from element to element; this is done to compress 
the overall length of the antenna. (In NEC, such crossed lines are specified by us- 
ing a negative characteristic impedance.) These are also implemented using a loop. 
Finally, the transmission lines of log-periodic antennas are often terminated with
a resistive load (usually equal to the transmission line characteristic impedance,
100 Ω in this case) to improve the impedance match. This is done here via the special
handling of the transmission line at the far end from the feed (the first one created
in the loop of Fig. 5.10), which adds a shunt (parallel) admittance of 1/100 S to the
feed segment of the longest element.


** Analysis of a 9-element logarithmic periodic antenna.

** Some definitions for the geometry
#sigma = 0.149          ** scaling factor for spacing [eqn.6.83,S&T]
#tau = 0.822            ** scaling factor for elements [eqn.6.85,S&T]
#len = 2.78             ** length of element (initially L_1)
#rad = #len/250.0       ** radius of first element: L/2a = 125
#Zline = 100            ** transmission line impedance
#Zload = 100            ** load impedance at the last element (set to very large value
                        ** if not present)
#num = 9                ** number of elements

** Frequency specification and segmentation
#freq_min = 50.0e6      ** start frequency
#freq_max = 250.0e6     ** stop frequency
#lambda_min = #c0/#freq_max   ** minimum wavelength
#seglen = #lambda_min / 20
IP                            #seglen

** Initial values for the loop
!!FOR #i = 1 to #num
!!IF (#i = 1) THEN
** This is the first element to be created, at origin
#x = 0
!!ELSE
** Other elements spaced logarithmically
#x = #x+#d
!!ENDIF
** Create the wire with the correct radius, use a unique
** label #i for the centre segment
#z = #seglen            ** ensure that just one segment at the centre
DP P1                   #x        0         -#len/2.0
DP P2                   #x        0         -#z/2.0
DP P3                   #x        0         #z/2.0
DP P4                   #x        0         #len/2.0
LA 0
BL P1 P2                #rad
LA #i
BL P2 P3                #rad
LA 0
BL P3 P4                #rad
** Compute inter-element spacing to next element. Note that d_n is the spacing
** between elements L_n and L_n+1 and must be computed using current length.
#d = 2.0*#sigma*#len
** Now apply scaling for next element (shorter)
#len = #len*#tau
#rad = #rad*#tau
!!NEXT
** End of the geometry
EG 1 0 0 0 0

** Create all the transmission lines (again a loop is very useful)
!!FOR #i = 1 to #num-1
** Extra shunt admittance at the first element
!!IF #i=1 THEN
#YS = 1 / #Zload
!!ELSE
#YS = 0
!!ENDIF
** Define the transmission line from label #i to label #i+1 (crossed)
TL 1 #i #i+1 1 1 #Zline #YS
!!NEXT

** Excitation by a voltage source at the last (shortest) element
FR 2 #freq_min #freq_max
A1 0 #num 1 0
** Vertical radiation pattern - gain
FF 1 1 1 1 90 0
** Vertical radiation pattern - directivity
FF 1 1 1 0 90 0
EN

Figure 5.10 PREFEKO file for the nine-element log-periodic array.


CM 9 element log-p
CE Start of geometry
GW1,47,0.000000,0.000000,-1.390000,0.000000,0.000000,1.390000,0.01110
GW2,39,0.828400,0.000000,-1.142600,0.828400,0.000000,1.142600,0.00910
GW3,33,1.509400,0.000000,-0.939200,1.509400,0.000000,0.939200,0.00750
GW4,27,2.069200,0.000000,-0.772000,2.069200,0.000000,0.772000,0.00620
GW5,23,2.529300,0.000000,-0.634600,2.529300,0.000000,0.634600,0.00510
GW6,19,2.907500,0.000000,-0.521600,2.907500,0.000000,0.521600,0.00420
GW7,15,3.218400,0.000000,-0.428800,3.218400,0.000000,0.428800,0.00340
GW8,13,3.474000,0.000000,-0.352500,3.474000,0.000000,0.352500,0.00280
GW9,11,3.684100,0.000000,-0.289700,3.684100,0.000000,0.289700,0.00230
GE 0 0
PT -1
PL 3 1 0 1
TL 1 24 2 20 -100.0000 0.00000 0.01000 0.00000 0.00000 0.00000
TL 2 20 3 17 -100.0000 0.00000 0.00000 0.00000 0.00000 0.00000
TL 3 17 4 14 -100.0000 0.00000 0.00000 0.00000 0.00000 0.00000
TL 4 14 5 12 -100.0000 0.00000 0.00000 0.00000 0.00000 0.00000
TL 5 12 6 10 -100.0000 0.00000 0.00000 0.00000 0.00000 0.00000
TL 6 10 7 8 -100.0000 0.00000 0.00000 0.00000 0.00000 0.00000
TL 7 8 8 7 -100.0000 0.00000 0.00000 0.00000 0.00000 0.00000
TL 8 7 9 6 -100.0000 0.00000 0.00000 0.00000 0.00000 0.00000
EX 0 9 6 00 1.00000 0.00000
FR 0 101 0 0 50.00000 2.0000000
RP 0 1 1 1000 90.00000 0.00000 0.00000 0.00000 0.00000 0
EN

Figure 5.11 NEC file for the nine-element log-periodic array.


In NEC, the absence of user-defined variables, loops etc. means that we have no 
option other than to compute the values explicitly and enter them by hand, either 
into a NEC file directly, or using a preprocessor. An example of a NEC file for this 
log-periodic array is given in Fig. 5.11. 
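
Computing the entries by hand is tedious and error-prone, and a short script can generate them instead. The Python sketch below produces the TL cards of Fig. 5.11 from the per-wire segment counts. The names `LOGP_SEGMENTS`, `center_segment` and `crossed_tl_cards` are invented for this illustration; the field layout simply mirrors the TL lines of the figure (the two wire tag and segment pairs, a negative impedance for a crossed line, a zero length field, then the shunt admittance at each end), and whether a given NEC-2 build accepts this free-format spacing is an assumption.

```python
# Segment counts of wires 1 to 9, copied from the GW cards of Fig. 5.11.
LOGP_SEGMENTS = [47, 39, 33, 27, 23, 19, 15, 13, 11]


def center_segment(num_segments):
    """1-based index of the middle segment of a wire with an odd segment count."""
    if num_segments % 2 == 0:
        raise ValueError("an odd number of segments is needed for a centre feed")
    return (num_segments + 1) // 2


def crossed_tl_cards(segments_per_wire, z_line, z_load=None):
    """TL cards joining the centre segments of consecutive wires (crossed).

    If z_load is given, a shunt admittance 1/z_load is placed at end one of
    the first line, i.e. on the longest element, as in Fig. 5.11.
    """
    cards = []
    for i in range(len(segments_per_wire) - 1):
        tag1, tag2 = i + 1, i + 2
        seg1 = center_segment(segments_per_wire[i])
        seg2 = center_segment(segments_per_wire[i + 1])
        y_shunt = 1.0 / z_load if (i == 0 and z_load) else 0.0
        cards.append(
            f"TL {tag1} {seg1} {tag2} {seg2} {-abs(z_line):.4f} {0.0:.5f} "
            f"{y_shunt:.5f} {0.0:.5f} {0.0:.5f} {0.0:.5f}"
        )
    return cards


if __name__ == "__main__":
    for card in crossed_tl_cards(LOGP_SEGMENTS, z_line=100.0, z_load=100.0):
        print(card)
```

Passing `z_load=None` gives the unterminated case, which is the NEC counterpart of setting #Zload to a very large value in the PREFEKO file.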


Code tip — some useful NEC functions 


In Fig. 5.11, two cards PT and PL are used which offer useful functionality. The
former is used for selectively or entirely suppressing the printing of the currents;
printing them is, perhaps unfortunately, the NEC default, and it inflates
output files with data which are rarely used. The latter produces an extra data file
(the specific name varies from implementation to implementation) with radiation
patterns or currents suitable for later plotting.


Results for the reflection coefficient and the gain of the log-periodic array 
are given in Figs. 5.12 and 5.13 respectively. Results computed with NEC-2 us- 
ing a similar segment length are given for some of the parameters, and excel- 
lent agreement is noted. Also indicated on Fig. 5.12 is the reflection coefficient 
level corresponding to a VSWR of 2, widely used as a specification for antenna 
impedance. (A VSWR < 2 actually corresponds to |S11| < -9.54 dB, as indicated,
but |S11| < -10 dB is often used instead for convenience.) It will be noted
how the use of the terminating resistance improves the impedance match; the antenna
has |S11| < -10 dB over almost the entire band in this case. Without the
terminating resistance, the reflection coefficient varies far more over the frequency 
band, sometimes lower, but also sometimes unacceptably high. Another point to 
note is that the log-p array must be fed from the shorter end; if fed from the 
longer end, the long dipoles are excited (but not very effectively) so that there 
is too little power at the higher frequencies to radiate properly from the shorter 
dipoles. 
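
The correspondence between VSWR and reflection coefficient used above follows from |Γ| = (VSWR - 1)/(VSWR + 1), with the reflection coefficient in dB given by 20 log10 |Γ|. A minimal Python sketch (the function names are chosen here for illustration):

```python
import math


def vswr_to_s11_db(vswr):
    """Reflection coefficient magnitude in dB for a given VSWR (>= 1)."""
    if vswr < 1.0:
        raise ValueError("VSWR cannot be less than 1")
    if vswr == 1.0:
        return float("-inf")  # perfect match: no reflection
    gamma = (vswr - 1.0) / (vswr + 1.0)
    return 20.0 * math.log10(gamma)


def s11_db_to_vswr(s11_db):
    """VSWR for a given reflection coefficient magnitude in dB (negative)."""
    gamma = 10.0 ** (s11_db / 20.0)
    return (1.0 + gamma) / (1.0 - gamma)


if __name__ == "__main__":
    print(f"VSWR 2 -> |S11| = {vswr_to_s11_db(2.0):.2f} dB")
    print(f"|S11| = -10 dB -> VSWR = {s11_db_to_vswr(-10.0):.3f}")
```

The -10 dB convention is therefore slightly stricter than VSWR < 2.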


[Plot: S11 [dB] versus Freq [MHz]. Curves: Terminated (FEKO), Unterminated (FEKO), Terminated (NEC).]

Figure 5.12 Reflection coefficient of the nine-element log-periodic antenna in the text, for
both resistively terminated and unterminated cases.


[Plot: gain and directivity [dBi] versus Frequency [MHz], 50 to 250 MHz. Curves: Gain (FEKO), Gain (NEC), Directivity (FEKO).]

Figure 5.13 A comparison of gain and directivity for the nine-element resistively terminated
log-periodic antenna. The gains computed by FEKO and NEC lie essentially on top
of one another.


On Fig. 5.13, both gain and directivity (also sometimes known as directive gain) 
are given. To revise these terms briefly, the former indicates how well the antenna 
focusses power spatially, relative to the power delivered to it; the latter indicates 
how well the antenna focusses power spatially, relative to the power radiated by it. 
Clearly, if the antenna has any loss, the two will not be identical, and the difference 
on Fig. 5.13 is due to the losses in the termination. We have traded off a better 
impedance match for a slightly poorer gain. (At the very top of the band, we are 
slightly under the 6.5 dB gain design specification. To improve this, we would 
have to repeat the design using a longer array, i.e. with more elements, but we 
will leave this as an exercise.) A final point: because the transmission line has a 
characteristic impedance of 100 Ω, it is tempting to use this as the impedance level
when computing S11 etc. However, one should recall that this line is in parallel with
the radiating dipole(s), with an impedance of typically 50 to 70 Ω. The net result
is that this antenna is quite well matched to a 50 Ω system, which is the FEKO
default. Note also that we only compute the gain at one angle, in the direction 
along the axis of the antenna. A log-p is an end-fire antenna, and radiates in the 
direction from longest to shortest element. 

This example also introduces another feature which FEKO supports, namely the 
use of adaptive frequency sampling (this is not supported by NEC). This example
is sufficiently complex (227 wire segments as discretized) that FEKO takes a
noticeable amount of time, typically a second or two, to compute each frequency
point. However, the data change rapidly over frequency, requiring a lot of points;
to obtain good results with uniformly spaced frequency points over the frequency 
band of interest, one would need at least 100 points, preferably more. FEKO has 
the ability to determine where to place frequency points in a non-uniform fash- 
ion, as well as then intelligently interpolating the data by using what is termed a 
model based parameter representation. We use the defaults for this option in the 
frequency card in the PREFEKO file in Fig. 5.10. 


Modelling hints — gains in dB or actual values 


Be very careful when plotting gains for these relatively low-gain structures; the 
gains in dB or in actual value are quite similar numerically, and it is easy to plot 
the wrong dataset, especially when exporting data! 
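
To see how close the two representations are numerically, the conversion can be written out. A small Python sketch, with function names chosen here for illustration:

```python
import math


def db_to_linear(gain_db):
    """Convert a power gain in dB to a dimensionless ratio."""
    return 10.0 ** (gain_db / 10.0)


def linear_to_db(gain):
    """Convert a dimensionless power gain to dB."""
    return 10.0 * math.log10(gain)


if __name__ == "__main__":
    for g_db in (3.0, 6.5, 10.0):
        print(f"{g_db:4.1f} dB  <->  {db_to_linear(g_db):5.2f} (linear)")
```

For gains of a few dB the two columns are of the same order, and at 10 dB they coincide, so a mislabelled dataset is not obvious from the magnitude of the numbers alone.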


5.5 An axial mode helix antenna 


Helical antennas are another interesting type of antenna. The axial mode helix was 
invented by Kraus at Ohio State University in 1946, and his textbook on anten- 
nas is a mine of information on the subject [8]. Their bandwidth ratio is given 
theoretically as approximately 1.78 [6, Section 6.2.2]. (A wideband antenna is 
conventionally defined as one where this ratio exceeds 2, so the helix is close to 
being “wideband.”) Details are also available in [7, Section 10.3.1]. It is also a wire 
structure, but unlike all the previous antennas we have analyzed, which all relied 
on a standing wave on some part of the structure, this is a traveling wave antenna, 
at least in its axial mode of operation, which is the most common mode of employ- 
ment. The circumference of the antenna is chosen such that currents on opposite 
sides of the antenna (which would radiate fields out of phase due to the winding 
of the helix, if the currents were in phase) are delayed by a half-wavelength, so 
that the resulting radiation is now in phase again along the axis of the helix (and 
hence the name, axial mode). The radiation is circularly polarized, with the sense 
of the winding, i.e. a right(left)-hand wound helix generates right(left)-hand po- 
larization. Compared to other candidates, the axial mode helix is quite compact — 
the helical structure permits a lot of wire to be contained in a moderately small 
volume — and the design is very popular in the UHF band, especially for satel- 
lite communication. (A closely related structure, namely the normal mode helix, is 
very popular for mobile telephones. It radiates almost isotropically.) 

FEKO provides the HE card, which greatly simplifies creation of a helix. Indeed, 
all that is required other than this card is to add a short segment below the helix to 
feed it with, and to add a ground plane of some type underneath it. A ground plane 
of around 0.75λ on each side is usually adequate [6, Section 6.2.2]. 

To create a ground plane, one can use a mesh of wires — and indeed this was a 
very widely used method with NEC-2. However, FEKO supports the creation and 
meshing of surfaces. A simple method of defining a surface is using the parallelo- 
gram card (BP). This surface is then meshed using triangles. 


Modelling hints — connecting wires and plates 


Here is something to be careful of. The obvious approach when grounding the 
helical wire is to generate one surface in the plane z = 0, where the feed seg- 
ment terminates. However, this usually will not work properly! The reason is 
that FEKO, and indeed any MoM code, needs the nodes defining the segments 
on the wire and the triangular segments on the surface to coincide. Many new 
users overlook this and it is a frequently encountered fault. In the PREFEKO file, 
we have generated only a quarter of the ground; this of course includes a point at 
the origin, where the feed segment connects. We then use geometrical symmetry 
in two planes (x = 0 and y = 0) first to create half the ground plane, and then 
to create the entire ground plane. (The PREFEKO file supplied does this in first 
the x = 0 plane, then the y = 0 plane, but the order is actually irrelevant in this 
example.) A recent addition to FEKO permits users to specify internal nodes 
in polygonal plates, using the PM card, which makes it easier to make sure that 
wires correctly connect to nodes on surfaces. 


One final point regarding creating the geometry. FEKO also offers a ground 
plane (BO) card, and this would appear to be very useful. However, one needs to 
read the “fine print” in this case. This card uses a reflection coefficient approxi- 
mation, i.e. the fields radiated by the structure are imaged in the ground plane, but 
the ground plane is not taken into account when the currents are computed by the 
MoM. As such, it is very useful for antennas some distance above a ground plane, 
where the currents are indeed hardly changed by the presence of the ground, but 
entirely inappropriate for an antenna fed right against a ground, as the helix is. A 
careful reading of the user manual cautions that segments should not connect to the 
ground, but does not describe in detail why this ground plane would be incorrect 
in this application. 

A detailed design example is given in [6, Example 6.2]. The antenna is to operate 
in the microwave band, with center frequency 8 GHz. The circumference of the 
antenna, C, is specified as 0.92λ = 34.5 mm. (It will be noted that the scaling 
card is also used in the PREFEKO file to permit all dimensions to be entered in 
millimeters, which is far more convenient than meters in this frequency range.) 
The pitch angle α is chosen as 13° (a value based on prior design experience). 
The spacing between turns, S = C tan α, works out at 7.96 mm, and the antenna has N = 10 
turns. With the 1.78 bandwidth ratio and center frequency of 8 GHz, the lower and 
upper frequencies are 5.75 GHz and 10.25 GHz respectively. The PREFEKO file 
is given in Fig. 5.14, and Fig. 5.15 shows a FEKO model of the antenna. 
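
The design numbers above can be checked with a few lines of arithmetic. This is a minimal sketch: the function name and dictionary keys are illustrative, and the band edges are assumed to be placed symmetrically about the center frequency (which is consistent with the 5.75 GHz and 10.25 GHz quoted).

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum [m/s]


def helix_dimensions(f_center_hz, circ_wavelengths, pitch_deg, n_turns, bw_ratio):
    """Basic axial-mode helix dimensions (mm) and band edges (GHz)."""
    lam_mm = C0 / f_center_hz * 1e3                   # free-space wavelength
    circ_mm = circ_wavelengths * lam_mm               # circumference C
    spacing_mm = circ_mm * math.tan(math.radians(pitch_deg))  # S = C tan(alpha)
    # Assumption: f_high / f_low = bw_ratio and (f_low + f_high) / 2 = f_center.
    f_low = 2.0 * f_center_hz / (1.0 + bw_ratio)
    f_high = bw_ratio * f_low
    return {
        "lam_mm": lam_mm,
        "circ_mm": circ_mm,
        "radius_mm": circ_mm / (2.0 * math.pi),
        "spacing_mm": spacing_mm,
        "length_mm": n_turns * spacing_mm,
        "f_low_ghz": f_low / 1e9,
        "f_high_ghz": f_high / 1e9,
    }


design = helix_dimensions(8e9, 0.92, 13.0, 10, 1.78)
for key, value in design.items():
    print(f"{key:12s} {value:8.3f}")
```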

Radiation patterns at the lower, center and upper frequencies are shown in 
Fig. 5.16 with a ground plane 1.5λ on a side, somewhat larger than the minimum 
recommended. The gain at 8 GHz is exactly 13 dBi, somewhat higher than the 
10.5 dBi gain predicted by the approximate formula [6, Eq. (6-34)] 

\[ G \approx 6.2 \left(\frac{C}{\lambda}\right)^{2} N \frac{S}{\lambda} \tag{5.4} \]

Commensurate with this increased gain, the half-power (HP) beamwidth of 40° is 
somewhat smaller than that predicted by the approximate formula [6, Eq. (6-33)] 

\[ \mathrm{HP} \approx \frac{65^\circ}{\dfrac{C}{\lambda}\sqrt{N \dfrac{S}{\lambda}}} \tag{5.5} \]

of 48°. It must be emphasized that these are approximate empirical formulas, so 
some differences are to be expected. Kraus provides another formula [8, Eq. (7), 
p. 235] for directivity, which he describes as more realistic: 

\[ D \approx 12 \left(\frac{C}{\lambda}\right)^{2} n \frac{S}{\lambda} \tag{5.6} \]

Using this formula yields a gain of around 13.3 dBi, almost exactly as simulated. 
(Since the antenna is essentially lossless, we are again using gain and directivity 
interchangeably.) 
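
As a quick check of the figures quoted, the three empirical estimates can be evaluated for C = 0.92λ, α = 13° and N = 10. This is a sketch under stated assumptions: the gain and directivity expressions are taken as 6.2 (C/λ)² N S/λ and 12 (C/λ)² n S/λ, and the beamwidth expression is assumed to be the common empirical form 65° / [(C/λ) √(N S/λ)], which reproduces the 48° quoted. The function name is illustrative.

```python
import math


def helix_estimates(c_lam: float, s_lam: float, n_turns: int):
    """Return (gain_dBi, kraus_directivity_dBi, half_power_beamwidth_deg)."""
    volume_term = c_lam ** 2 * n_turns * s_lam
    gain_dbi = 10.0 * math.log10(6.2 * volume_term)          # form of Eq. (5.4)
    directivity_dbi = 10.0 * math.log10(12.0 * volume_term)  # form of Eq. (5.6)
    hp_deg = 65.0 / (c_lam * math.sqrt(n_turns * s_lam))     # assumed form of Eq. (5.5)
    return gain_dbi, directivity_dbi, hp_deg


c_lam = 0.92
s_lam = c_lam * math.tan(math.radians(13.0))  # S/lambda = (C/lambda) tan(alpha)
gain, directivity, hp = helix_estimates(c_lam, s_lam, 10)
print(f"G ~ {gain:.1f} dBi, D (Kraus) ~ {directivity:.1f} dBi, HP ~ {hp:.0f} deg")
```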

From Fig. 5.16, the gain at the lower frequency is almost 3 dB less than at the 
center frequency, and the pattern is starting to show some “squint”; the main beam 
has moved slightly to the left. At the upper frequency band, the gain has increased 
and the main beam has narrowed (which may or may not be acceptable, depending 
on the design requirements). 

Impedance results are shown in Fig. 5.17. (These data were generated using 
adaptive frequency sampling.) It will be noted that the antenna is largely resistive 
across most of the frequency band. However, towards the lower end of the band, 
the otherwise smooth impedance curves break down. This type of behavior is not 
predicted by the simple description of operation as a traveling wave antenna [6]. 
Measured data by Baker [8, Fig. 8-73], who worked on helix arrays with Kraus, 
indicate almost exactly the same impedance behavior at around 0.7 of the center 
** A 10-turn helical antenna
** Variables
** Optional scaling factor (set to 0.001 for geometrical data
** defined in mm instead of metres)
#scaling = 0.001
** Frequency and wavelength
#freq = 8e9                 ** frequency in Hertz
#freq_min = 5.75e9
#freq_max = 10.25e9
#lam = #c0/#freq            ** wavelength in metre (#c0: speed of light in vacuum)
#lam_mm = #lam/#scaling
#circum = 34.5              ** helix circumference
#h_rad = #circum/(2*#pi)
#h_len = 79.6               ** helix length
#gnd = 1.5*#lam_mm
** Parameters for segmentation
#seg_rad = #lam_mm/100      ** radius of the wire segments
#seg_len = #lam_mm/20       ** maximum length of wire segments
#tri_len = #lam_mm/10       ** maximum size of triangles
IP   #seg_rad   #tri_len   #seg_len
** Quarter of ground plane
DP   G1     0.0      0.0      0.0
DP   G2     #gnd/2   0.0      0.0
DP   G3     #gnd/2   #gnd/2   0.0
DP   G4     0.0      #gnd/2   0.0
BP   G1   G2   G3   G4
** Generate rest of ground - imaged first in x=0, then y=0 planes.
SY   1   1   0   0
SY   1   0   1   0
DP   ZERO   0.0      0.0      0.0
DP   A1     0.0      0.0      2*#seg_len
DP   B1     0.0      0.0      #h_len
DP   C1     #h_rad   0.0      2*#seg_len
HE   A1   B1   C1   0   10
LA   1
BL   C1   ZERO


Figure 5.14 PREFEKO file for the 10 turn helix. 
** Apply the scaling factor
SF   1   #scaling
** End of geometric input
EG   1   0   0   0   0
** Voltage gap excitation at segment just above ground
A1   0   1   1.0
** Note: using adaptive frequency sampling permits only
** ONE of the following analysis options:
** ** Set the frequency card for adaptive frequency sampling.
** FR   2   #freq_min   #freq_max
**
** ** Trigger execution, no patterns.
** FF   0
**
** Set discrete frequency for radiation patterns.
FR   3   0   #freq_min   #freq_max
** Radiation pattern
FF   1   181   1   0   -90   0    1.0
FF   1   181   1   0   -90   90   1.0
** End
EN


Figure 5.14 (Continued) 


frequency, with the reflection coefficient suddenly increasing dramatically from 
less than −20 dB to −2 dB or worse over a very small frequency change. (Baker’s 
helix was not precisely the same as the one simulated here, hence the frequency 
at which this effect occurs is slightly different.) The reason is that the axial mode 
ceases effective operation quite abruptly; [8, Fig. 8-34] provides more information 
on this, in particular via the phase velocity. 

In the region near the design frequency, the resistance and reactance values are 
well behaved, as shown in Fig. 5.18. An approximate formula for the input resis- 
tance of the axial mode helix is 


\[ R \approx 140\,\frac{C}{\lambda}\ \Omega \tag{5.7} \]


At 8 GHz, this gives a value of approximately 129 Ω, whereas FEKO indicates a value closer 
to 170 Ω. It should be noted that the above formula is to be regarded only as 
an approximation, so the FEKO result is very credible. Also giving confidence in 
the FEKO results is the approximately linear increase in resistance, at least in the 
central part of the frequency band. In practice, such an antenna would probably be 
fed via an impedance matching transformer, probably with a 3:1 ratio. As such, 
a reference impedance of Z0 = 150 Ω is appropriate when plotting the reflection 
coefficient, which is shown in Fig. 5.19. 


Figure 5.15 The FEKO model of the axial mode helix antenna discussed in the text. 
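
The effect of the reference impedance is easy to quantify. The sketch below treats the helix as a purely resistive 170 Ω load at 8 GHz (an assumption based on the resistance quoted; the small reactance is ignored) and compares the reflection coefficient against 150 Ω and 50 Ω references. The function name is illustrative.

```python
import math


def reflection_db(z_load: complex, z_ref: float) -> float:
    """Magnitude of the reflection coefficient in dB for a given reference impedance."""
    gamma = (z_load - z_ref) / (z_load + z_ref)
    if gamma == 0:
        return float("-inf")  # perfect match
    return 20.0 * math.log10(abs(gamma))


r_formula = 140.0 * 0.92   # approximate input resistance for C = 0.92 wavelengths
r_feko = 170.0             # resistance quoted from the simulation at 8 GHz

print(f"formula resistance: {r_formula:.1f} ohm")
print(f"|S11| re 150 ohm:   {reflection_db(r_feko, 150.0):.1f} dB")
print(f"|S11| re  50 ohm:   {reflection_db(r_feko, 50.0):.1f} dB")
```

Against 50 Ω directly the same load would look poorly matched, which is why the 3:1 transformer (and hence the 150 Ω reference) is assumed when judging the match.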

To evaluate this antenna fully for an actual design exercise, one should also 
check the axial ratio of the polarization, since this is an important parameter when 
designing circularly polarized antennas. This information is also available in the 
.out file, but may require some manipulation to present graphically. More details 
are available in [8]. 

In summary, the helix performs well from around 6.2 GHz to at least 10.75 GHz, 
in terms of impedance match (S11 less than −10 dB, assuming a 3:1 impedance 
transformer for a 50 Ω system, as above) and offering reasonable pattern behavior. 
This is a bandwidth ratio of 1.73. The gain at the center frequency agrees very 
well with Kraus’s improved formula, and at the lower end of operation, the 
reflection coefficient shows the same behavior as measured data for a similar (but 
not identical) helix. The empirical design formulas give reasonable guidelines for 
gain and half-power beamwidth, but the numerical simulation provides much more 
accurate data. In an actual design, the helix as simulated may be acceptable for the 
application; if not, one at least is aware that a redesign is likely to be advisable, 
without even the need first to build a prototype. 


Figure 5.16 The gain of the axial mode helix antenna at the lower, center and upper ends 
of its operating band. 

Figure 5.17 Resistance (R) and reactance (X) of the helix antenna across the entire 
operating band. 

Figure 5.18 Resistance (R) and reactance (X) of the helix antenna near the design 
frequency. 

Figure 5.19 Reflection coefficient of the helix. 


Code tips — modelling this structure in NEC 


Later versions of NEC-2 included a GH card, which permits one to specify a 
helix or spiral with the same ease as the FEKO model discussed in this section. 
However, modelling grounds in NEC is more problematic. One is tempted to use 
the SP card, which generates a surface patch model. However, this uses the mag- 
netic field integral equation, which as we will see in Chapter 6 is not suitable for 
modelling an open structure. Instead, a ground plane will have to be built from 
a wire mesh, either by hand (there is no automatic means to do this in NEC) 
or using the Wiregrid for Windows package, which supports this functionality.7 
Wiregrid approximations of surfaces were studied in detail by Ludwig [9], who 
confirmed using a careful analysis that the long-used “equal area rule” produced 
a good approximation. This rule requires that the surface area of the wires parallel 
to one linear polarization when “rolled flat” should equal the surface area of 
the solid surface. (For an arbitrary polarization, the wire surface area should be 
doubled.) One quickly sees that this implies a segment length Δ ≈ 2πa, with 
a the wire radius, which is pushing the limits of the thin-wire approximation. 
Also, we repeat our earlier warning: one must be very careful to ensure that the 
helix wire and wires representing the ground plane actually connect. 


7 FEKO includes a WG card to do this, although due to its surface meshing capabilities, one will probably not 
use this too often. 
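
The equal area rule is simple enough to evaluate directly. In this sketch (function names are illustrative) a square grid cell of side Δ contains, for one polarization, one wire of length Δ whose rolled-flat area 2πaΔ is set equal to the cell area Δ².

```python
import math


def equal_area_radius(grid_spacing: float) -> float:
    """Wire radius a satisfying 2*pi*a*spacing = spacing**2 (one polarization)."""
    return grid_spacing / (2.0 * math.pi)


def length_to_radius_ratio(grid_spacing: float) -> float:
    """Segment length divided by wire radius when segments span one grid cell."""
    return grid_spacing / equal_area_radius(grid_spacing)


spacing = 0.1  # grid spacing in wavelengths (hypothetical value)
print(f"radius = {equal_area_radius(spacing):.4f} wavelengths")
print(f"segment length / radius = {length_to_radius_ratio(spacing):.2f}")
```

The ratio of segment length to radius is fixed at 2π, about 6.3, regardless of the spacing chosen, which is the sense in which the rule strains the thin-wire assumption.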


5.6 A Wu-King loaded dipole 


Thus far, all the antennas discussed in this chapter were assumed to consist of 
perfectly conducting wires. (The log-periodic antenna included a terminating re- 
sistance, which was introduced to improve the impedance match, although the ele- 
ments were still assumed to be perfectly conducting.) In practice, the conductance 
of the metals traditionally used for constructing wire antennas (aluminum, steel 
etc.) is sufficiently high that this is an excellent assumption. In this example, how- 
ever, we are going to study an antenna deliberately loaded with resistance — the 
Wu-King resistively loaded dipole. This antenna, first described in [10, 11], has a 
continuous resistive loading. In practice, this can be made either using thin tubular 
sections of varying radius and material [10], or by approximating the continuous 
loading by discrete resistors [12]. 
Wu and King showed that if the loading on a dipole, half-length h, had the 
following form:8 

\[ Z^{i}(z) = \frac{\eta_0 \Psi}{2\pi h\,(1 - |z|/h)} \quad [\Omega/\mathrm{m}] \tag{5.8} \]

then the current had the following approximate form: 

\[ I(z) = I_0 \left(1 - \frac{|z|}{h}\right) e^{-jk|z|} \tag{5.9} \]


This is clearly a traveling wave. By comparison, on the usual half-wave resonant 
PEC dipole, the current has the standing wave form sin[k(h − |z|)]. 
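
The difference between the two current forms is easy to tabulate. The sketch below assumes the traveling-wave current is I(z) = I0 (1 − |z|/h) exp(−jk|z|), the standard Wu-King result, and compares it with the standing-wave form for h = λ/4. Function names and the unit amplitude are illustrative.

```python
import cmath
import math


def wu_king_current(z: float, h: float, k: float, i0: float = 1.0) -> complex:
    """Traveling-wave current: linear taper in magnitude, linear phase lag in |z|."""
    return i0 * (1.0 - abs(z) / h) * cmath.exp(-1j * k * abs(z))


def pec_dipole_current(z: float, h: float, k: float, i0: float = 1.0) -> float:
    """Standing-wave current on a thin PEC dipole (real valued, constant phase)."""
    return i0 * math.sin(k * (h - abs(z)))


lam = 1.0
h = lam / 4.0
k = 2.0 * math.pi / lam
for frac in (0.0, 0.25, 0.5, 0.75):
    z = frac * h
    i_tw = wu_king_current(z, h, k)
    i_sw = pec_dipole_current(z, h, k)
    print(f"z/h={frac:4.2f}  |I_tw|={abs(i_tw):.3f}  "
          f"phase={math.degrees(cmath.phase(i_tw)):7.2f} deg  I_sw={i_sw:.3f}")
```

The phase of the traveling-wave current falls linearly towards −kh (−90° for h = λ/4) at the tip, while the standing-wave current keeps a constant phase.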

The dimensionless parameter Ψ is complex valued, and a function of the electrical 
dimensions of the antenna. It is usually approximated by its DC value, Ψ0. 
It must be computed numerically; typical values are from just under 10 for moderately 
thick dipoles to around 20 for very thin ones. 

We will study the loaded dipole described by Maloney and Smith [12]; for their 
antenna, the ratio of half-height to radius h/a was 65.8, and Ψ0 = 7.79. For convenience, 
we will work with h = 0.25 m, so that the unloaded PEC dipole resonates 
close to 300 MHz. 

In FEKO, loading can be accomplished using several different cards: LD, LS 
and LP. The first implements distributed loading, in Ω/m, which is what we need 
here. (The other two cards implement lumped loads in series and parallel respec- 
tively.) FEKO loads segments via their label number, and hence one needs to la- 
bel each segment on the dipole separately. (A FEKO label is the equivalent of 
a NEC tag.) One way of doing this is shown in Figs. 5.20 and 5.21, where the 
dipole is first built from individual segments, and then loading is applied to each of 
these. 
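
The per-segment loading values can be previewed outside FEKO. The following sketch mirrors the segmentation logic described (an odd number of equal segments, with the load evaluated at each segment center) and assumes the loading profile Z(z) = η0 Ψ0 / [d π h (1 − |z|/h)], where d = 2 gives the “high” loading and d = 8 the “low” loading. The function and parameter names are illustrative.

```python
import math

ETA_0 = 376.730313668  # free-space wave impedance [ohm]


def wu_king_loading(h, psi0, seg_len_nominal, denom=2.0):
    """Return (delta, loads): segment length and a list of (z_center, ohm_per_m)."""
    n_half = math.ceil(h / seg_len_nominal)  # segments per arm, excluding the source
    n_total = 2 * n_half + 1                 # odd count keeps a segment at the center
    delta = 2.0 * h / n_total                # actual segment length
    loads = []
    for i in range(-n_half, n_half + 1):
        z = i * delta                        # center of segment i
        ohm_per_m = psi0 * ETA_0 / (denom * math.pi * h * (1.0 - abs(z) / h))
        loads.append((z, ohm_per_m))
    return delta, loads


delta, loads = wu_king_loading(h=0.25, psi0=7.79, seg_len_nominal=1.0 / 40.0)
for z, r in loads[len(loads) // 2:]:
    print(f"z = {z:6.4f} m   loading = {r:9.1f} ohm/m")
```

Because the load is sampled at segment centers, the factor 1 − |z|/h never reaches zero, so the profile stays finite even on the end segments.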

The reflection coefficient of the Wu—King dipole is compared to a PEC (un- 
loaded) one of the same dimensions in Fig. 5.22. In these results, two values 
of loading are shown: the “high” value is as in Eq. (5.8), the “low” value is 
as given in their original paper, with an 8 instead of 2 in the denominator. The 
Wu-King dipole has a rather high input impedance (given approximately by 
60Ψ), so Z0 = 300 Ω was used when computing the reflection coefficient (for 
the PEC dipole, Z0 = 75 Ω was used). The loaded dipole clearly has a much larger 
impedance bandwidth, and is indeed a wideband antenna. The rather poor result 
for the higher loading is due to a large, but slowly varying, reactive component, 
as shown in Fig. 5.23; this could be removed by adding a tuning component in an 
actual application, but this has not been done here. Figure 5.24 shows the current 
distributions along both loaded and unloaded dipoles at 280 MHz, the resonant 


8 Note the major corrections in [11]; the corresponding expression [12, Eq. (1)] is correct. 
** A resistively loaded (Wu-King profile) dipole.
** As in Maloney and Smith, IEEE T-AP, May 1993 p.668-676.
** Variables
** Frequency and wavelength
#lam = 1.00                     ** wavelength in metre
#freq = #c0/#lam                ** frequency in Hertz
#h = #lam/4                     ** half-height of antenna [m]
#f_l = 200e6
#f_u = 600e6
#seg_rad = #h/65.8              ** radius of wire
#eta_0 = sqrt(#mu0/#eps0)
#psi_0 = 7.79                   ** Wu-King parameter
** Parameters for segmentation
#seg_ln = #lam/40               ** nominal length of wire segments
IP   #seg_rad   #seg_ln
#num_sg2 = ceil(#h/#seg_ln)     ** segments on each dipole half (excl. source)
#num_sg = 2*#num_sg2+1          ** to ensure odd number of overall segments
#delta = 2*#h/#num_sg           ** actual length of wire segments
** Geometry of radiating structure
** Has to be constructed with two loops and a special source segment, since a
** separate label is required for each segment
** Construct centre (source) segment
#lab = #num_sg
LA   #lab
DP   A   0.0   0.0   -#delta/2
DP   B   0.0   0.0   #delta/2
BL   A   B
** Construct upper half
#ell1 = #delta/2
#ell2 = #delta/2+#delta
!!for #ii = 1 to #num_sg2
#lab = #ii
LA   #lab
DP   A   0.0   0.0   #ell1
DP   B   0.0   0.0   #ell2
BL   A   B
#ell1 = #ell1+#delta
#ell2 = #ell2+#delta
!!next


Figure 5.20 PREFEKO file for the Wu-King loaded dipole, geometry. 
** Construct lower half
#ell1 = -#delta/2
#ell2 = -#delta/2-#delta
!!for #ii = 1 to #num_sg2
#lab = #ii+#num_sg2
LA   #lab
DP   A   0.0   0.0   #ell1
DP   B   0.0   0.0   #ell2
BL   A   B
#ell1 = #ell1-#delta
#ell2 = #ell2-#delta
!!next
** End of geometric input
EG   1   0   0   0   0


Figure 5.20 (Continued) 


** Load the structure - again, a loop structure is used.
** Load source segment
#load = #psi_0*#eta_0/(8*#pi*#h)
#lab = #num_sg
LD   #lab   #load               ** Loss
** Upper half and lower half at same time now:
!!for #ii = 1 to #num_sg2
#z = #ii*#delta
#load = #psi_0*#eta_0/(8*#pi*#h*(1-#z/#h))
#lab = #ii
LD   #lab   #load               ** Loss
#lab = #ii+#num_sg2
LD   #lab   #load               ** Loss
!!next
** Set the frequency
FR   41   0   #f_l   #f_u
** Voltage gap excitation at a segment
#lab = #num_sg
A1   0   #lab   1.0
** Calculate surface currents for current display
OS   1   1
EN


Figure 5.21 PREFEKO file for the Wu-King loaded dipole, loading. 
Figure 5.22 The reflection coefficient of the Wu-King dipole compared to a PEC dipole. 

Figure 5.23 The impedance of the Wu-King dipole, “high” loading. 


frequency of the unloaded dipole. (The magnitudes have been normalized; the 
higher impedance of the loaded dipole results of course in smaller values of current.) 
The loaded dipole with the higher loading clearly supports a traveling wave, 
with a phase difference along the dipole arm of a little more than the 90° predicted 
by Eq. (5.9) for h ≈ λ0/4, and with an almost linear current distribution, also as 
predicted. The phase for the unloaded dipole is almost constant, as one would 
expect from a standing wave distribution. The results for the lower loading are 
somewhere in between the pure standing wave of the unloaded dipole and the pure 
traveling wave of the dipole with higher loading. 


Figure 5.24 The current (normalized magnitude and phase) on the Wu-King and PEC 
dipoles. 

The wide bandwidth is, however, bought at a price: efficiency. Wu and King 
originally predicted a theoretical efficiency of 50% for h = λ0/4, but FEKO shows 
a much lower efficiency of around 7% at 300 MHz (Fig. 5.25). In a subsequent 
correction [11], Wu and King drastically revised their calculation, predicting a 
very similar value to the FEKO computation. The result for the lower loading 
case is around 23%, rather better. Interestingly and serendipitously, the original 
(incorrect) result by Wu and King provides generally better antenna performance, 
certainly in terms of reflection coefficient and efficiency, even though the current 
is not a pure traveling wave. 


Figure 5.25 The efficiency of the Wu-King dipole. 

Unfortunately, the Wu—King loaded dipole is one structure for which measured 
data are very scarce, and hence we have had to evaluate this model in terms of 
expected theoretical behavior. Useful measured data were published in [12], but in 
the time domain. (Although FEKO has a time domain option, it is only available 
for scattering problems.) 

Before leaving this structure, a fundamental point should be noted about wide- 
band antennas. The definition of this is inherently a frequency domain concept, 
and one should be careful to differentiate between a wideband antenna on the one 
hand, and a non-dispersive antenna on the other. The former type of antenna works 
well over a wide range of operating frequencies; the latter can radiate actual time 
domain pulses without distortion (obviously, it will also be wideband). A little 
thought about this from the viewpoint of the Fourier transform shows that this 
translates to requirements on not just constant magnitude response, but also phase 
linearity. Many wideband antennas (such as spirals and log-periodics) are disper- 
sive because different frequencies radiate from different parts of the structure. We 
will not pursue this further, but will mention in closing that the loaded dipole ex- 
hibits limited dispersion, and because of this is widely used in time domain antenna 
systems despite its low efficiency. 
Code tips — modelling this structure in NEC 


The LD card provides the same functionality as in FEKO, but the absence of 
user-defined variables in NEC means that one will have to compute the load- 
ing manually at each segment, so this will be a tedious structure to model in 
NEC. 


5.7 Conclusions 


In this chapter, we have discussed modelling thin-wire antennas using FEKO 
and NEC-2. Starting with a very simple dipole example, we progressed to more 
complex antennas, including Yagi-Uda and log-periodic dipole antennas, an ax- 
ial mode helix and a loaded dipole. The helix example also introduced the use of 
surface modelling. We highlighted a number of points which one must be care- 
ful with; perhaps the most crucial is to check that the solution is converged (but 
also not over-converged, due to the limits of the thin-wire approximation). We 
also emphasized the importance of validation, that is, checking computed results 
in some way. Historically, comparison to measured data or an analytical solution 
has been the most convincing method of validation. Nowadays, comparisons with 
data computed using other codes and/or formulations are increasingly widely used 
and accepted, and we have directly compared FEKO and NEC-2 results on sev- 
eral occasions, noting that one cannot expect exact agreement. (It has also been 
commented in this context that measured data must also be used with discretion.) 
A number of features supported by FEKO (but not NEC-2) which simplify an- 
tenna modelling were introduced, including iteration and conditional execution. 
Several other FEKO and NEC-2 features were also discussed, including the use 
of labels/tags, transmission lines, and various types of grounds. We also took the 
opportunity afforded by numerical simulation to improve an antenna design, by 
adding a terminating resistance to a log-periodic antenna and evaluating the change 
in antenna performance. 

Properly used, within its region of validity, we have seen that the thin-wire for- 
mulation is both accurate and very efficient computationally. Having completed 
this chapter, the reader should feel far more confident in modelling a wide range 
of wire antennas using tools such as FEKO and NEC-2. 

During this chapter, we very briefly touched on the modelling of surfaces in 
Section 5.5. This is an important part of many antenna designs — and also for 
scattering problems — and in the next chapter we will comprehensively discuss 
the modelling of surfaces and volumes using the MoM, as well as the attendant 
problems of high computational cost. 
The method of moments for surface modelling 


The helix antenna discussed in the previous chapter used a new type of element to 
model surfaces. The theory underlying this is described in this chapter. Not only is 
the basic theory quite complex, but implementations are especially challenging, so 
we focus largely on an introductory discussion, followed by some examples of us- 
ing available codes, rather than going into the frequently lengthy details of full 3D 
implementations. We will see that not only can perfectly (or highly) conducting 
structures be efficiently modelled using surface currents, but also homogeneous 
dielectric and/or magnetic regions, using fictitious equivalent currents. (We will 
even briefly describe how inhomogeneous bodies can be modelled using volumet- 
ric currents, but note at the outset that this is not one of the strong points of the 
MoM.) Modelling surfaces is far more computationally expensive than modelling 
wires, and some methods for reducing the computational cost will also be dis- 
cussed. These include a hybrid of the MoM and physical optics, and the general 
class of fast methods, including both those based on the FFT and the fast multipole 
method. We will also briefly touch on the use of parallel processing. 


6.1 Electric and magnetic field integral equations 


Following the same lines as the Pocklington equation (Chapter 4), integral equa- 
tions in either the magnetic or electric fields can be derived for problems with 
currents flowing on surfaces. The derivation is quite complex, and only the results 
will be presented here. One integral equation couples the incident electric field to 
the induced surface current, and is known as the electric field integral equation 
(EFIE): 


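\[ \hat{n} \times \vec{E}^{inc}(\vec{r}) = \hat{n} \times \int_S \left[ jk\eta\, \vec{J}_s(\vec{r}')\, G(\vec{r},\vec{r}') + \frac{\eta}{jk}\, \nabla'_s \cdot \vec{J}_s(\vec{r}')\, \nabla' G(\vec{r},\vec{r}') \right] dS', \quad \forall\, \vec{r}, \vec{r}' \in S \tag{6.1} \]

The $\nabla'$ operator implies differentiation in the source coordinates. $\hat{n}$ is the unit 
normal vector on the surface S. $G(\vec{r},\vec{r}')$ is the scalar free-space Green function given by 

\[ G(\vec{r},\vec{r}') = \frac{e^{-jkR}}{4\pi R} \tag{6.2} \]

\[ R = |\vec{r} - \vec{r}'| \tag{6.3} \]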

Equation (6.1) is valid for both closed and open surfaces. In the latter case, Js is 
the sum of surface currents on both sides of the sheet. 
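
The scalar Green function is the one ingredient of the EFIE that is trivial to code. A minimal sketch, assuming the exp(−jkR)/(4πR) form with R the distance between field and source points; the function name is illustrative and the singular case R = 0 is simply rejected rather than treated.

```python
import cmath
import math


def green_free_space(r_field, r_source, k: float) -> complex:
    """Scalar free-space Green function exp(-j k R) / (4 pi R)."""
    dist = math.dist(r_field, r_source)  # R = |r - r'|
    if dist == 0.0:
        raise ValueError("field and source points coincide: G is singular")
    return cmath.exp(-1j * k * dist) / (4.0 * math.pi * dist)


k = 2.0 * math.pi  # wavenumber for a wavelength of 1 m
g_half_wave = green_free_space((0.0, 0.0, 0.0), (0.5, 0.0, 0.0), k)
print(g_half_wave)  # half a wavelength away the phase has advanced by 180 degrees
```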

The other integral equation couples the incident magnetic field to the induced 
surface current, and is known as the magnetic field integral equation (MFIE): 


\[ \frac{1}{2}\vec{J}_s(\vec{r}) = \hat{n} \times \vec{H}^{inc}(\vec{r}) + \hat{n} \times \int_S \vec{J}_s(\vec{r}') \times \nabla' G(\vec{r},\vec{r}')\, dS', \quad \forall\, \vec{r}, \vec{r}' \in S \tag{6.4} \]


This is valid only for closed surfaces. (The reason is not by any means straightforward, 
and emerges during the derivation thereof.) It is interesting to note that if 
we neglect the surface integral, what remains is the physical optics approximation, 
$\vec{J}_s(\vec{r}) = 2\hat{n} \times \vec{H}^{inc}(\vec{r})$, of which more later. 
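
The physical optics current mentioned here amounts to a single cross product. The sketch below is illustrative only: the helper names are assumptions, vectors are plain 3-tuples, and the usual physical optics practice of setting the current to zero in shadowed regions is not shown.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)


def po_current(n_hat, h_inc):
    """Physical optics surface current: J_s = 2 n x H_inc (illuminated side only)."""
    return tuple(2.0 * component for component in cross(n_hat, h_inc))


# Plate in the z = 0 plane with unit normal +z, incident H field along +y (1 A/m).
print(po_current((0.0, 0.0, 1.0), (0.0, 1.0, 0.0)))
```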

The integrals in the above should be interpreted as the principal value of the 
integral. (The principal value of an integral with a singularity at r0 is essentially 
the value of the integral with a δ neighborhood around r0 removed; then the limit 
as δ → 0 is found.) In both these equations, the presence of singularities raises 
delicate issues and requires careful treatment. The simple expedient of slightly 
offsetting field and source points as was done with the one-dimensional wire problem 
(in that case, by treating the source as a filament on the wire axis, but still 
imposing the boundary condition on the surface of the wire) can still be used, 
although in this case one offsets the quadrature points corresponding to source and 
field points rather than concentrating the source elsewhere. 

Mathematically, the EFIE is a Fredholm integral equation of the first kind: the 
unknown appears only under the integral sign. The MFIE is a Fredholm integral equation of 
the second kind: the unknown appears both inside and outside the integral. The 
difference is due to the boundary condition. The EFIE and MFIE 
are both derived from the Stratton-Chu formula, which states that for points on the 
surface, the following relations hold [1, p. 172]: 

\[ \vec{E}^{inc}(\vec{r}) + \mathrm{PV}\!\int_S \vec{E}_s\, dS' = \frac{1}{2}\vec{E}(\vec{r}) \]

\[ \vec{H}^{inc}(\vec{r}) + \mathrm{PV}\!\int_S \vec{H}_s\, dS' = \frac{1}{2}\vec{H}(\vec{r}) \tag{6.5} \]

The PV here reminds us that these are the principal values of the relevant integrals. 
In the equations above, $\vec{E}_s$ and $\vec{H}_s$ are not directly the scattered fields, but rather 
the kernels which are integrated to obtain these (the full expressions may be found 
in [1, p. 173]). For a PEC, the boundary condition on $\hat{n} \times \vec{E}$, i.e. tangential $\vec{E}$, is 
of course zero, whereas $\hat{n} \times \vec{H}$ is the surface current $\vec{J}_s$; hence the different nature 
of the two integral equations. For more details, see [1, Section 12.3]. 

Mathematically, it is well known that Fredholm integral equations of the second kind are 
generally better posed; this motivated much work using the MFIE. (Put simply, 
a well-posed problem is one whose solution is not strongly dependent on the 
physics and geometry of the problem.) However, the requirement for a closed sur- 
face S is frequently a problem in applied CEM work, with the result that the EFIE 
is usually preferred in practical codes. Finally, linear combinations of the EFIE and 
MFIE have also been used; not surprisingly, this method is known as the combined 
field integral equation (CFIE). The CFIE will not be discussed here. 

Because the EFIE and MFIE are both quite complex, it is convenient to intro- 
duce a simplifying notation. As an example, for the EFIE, the right-hand side of 
Eq. (6.1), which represents the scattered field, is often written in the following 
shorthand: 


\[ \vec{E}_s = \mathcal{L}\, \vec{J}_s \]


$\mathcal{L}$, which represents all the mathematical operations to be performed on the current 
$\vec{J}_s$, is known as an operator; it is an extension of the concept of a function. 


A mathematical aside — functions, functionals and operators 


A function, of course, maps a number (integer, real or complex) to another num- 
ber; a functional maps a function to a number; and an operator maps a function 
to another function. (We will encounter functionals in Chapter 9.) The Fourier 
transform is a commonly encountered example of an operator: it maps a function 
of time to a function of frequency (or more generally, from one domain to the 
corresponding spectral domain). Operator notation will be used subsequently in 
this chapter. 
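
The three concepts map naturally onto code, where functions can be passed to and returned from other functions. The following sketch is purely illustrative; all names are assumptions and the numerical rules (midpoint integration, central differences) are chosen only for brevity.

```python
import math


def signal(t: float) -> float:
    """A function: maps a number to a number."""
    return math.sin(t)


def mean_square(func, a: float, b: float, n: int = 2000) -> float:
    """A functional: maps a function to a single number (midpoint rule)."""
    step = (b - a) / n
    total = sum(func(a + (i + 0.5) * step) ** 2 for i in range(n))
    return total * step / (b - a)


def derivative(func, step: float = 1e-6):
    """An operator: maps a function to another function (central difference)."""
    def d_func(t: float) -> float:
        return (func(t + step) - func(t - step)) / (2.0 * step)
    return d_func


print(signal(0.5))                               # a number
print(mean_square(signal, 0.0, 2.0 * math.pi))   # a number computed from a function
d_signal = derivative(signal)                    # a new function
print(d_signal(0.0))
```

In the same spirit, the integral operator in the EFIE takes a surface current distribution and returns a field distribution.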


6.2 The Rao—Wilton—Glisson (RWG) element 


When dealing with surfaces using the MoM, two matters need attention. The first 
is that we need to split the geometry up into small elements. The simplest ap- 
proach, and the first one explored historically in codes such as NEC-2, was to use 
square (or rectangular) patches. However, for general two-dimensional geometries,
triangular elements are better for approximating the geometry, and this is the
approach which most modern codes (including FEKO) use. The second matter is that
the physical parameter being approximated, $\mathbf{J}_s$, is now two dimensional. The basis
function must also incorporate this.

In this context, a very widely used basis function for the triangular patch was 
introduced by Rao, Wilton and Glisson in 1982 [2]. The basis function is often 
known simply as the RWG element. Subsequent work led to the realization that 
this basis function is very closely related to the edge-based elements widely used 
in contemporary finite element analysis. We will return to this later in this book 
when we address finite elements. 

The basis function includes some new features which have not yet been
encountered in this book. Most importantly, the basis function is vector in nature,
which means that the individual scalar components (e.g. $J_x$, $J_y$ and $J_z$) can only
be recovered with some manipulation. The essential idea is to enforce current
continuity over an edge of a patch. The interpolation function used to achieve this is
the following:


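$$\mathbf{f}_n(\mathbf{r}) =
\begin{cases}
\dfrac{l_n}{2A_n^+}\,\boldsymbol{\rho}_n^+, & \mathbf{r} \in T_n^+ \\
\dfrac{l_n}{2A_n^-}\,\boldsymbol{\rho}_n^-, & \mathbf{r} \in T_n^- \\
0, & \text{otherwise}
\end{cases} \qquad (6.6)$$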


Figure 6.1 defines the vectors $\boldsymbol{\rho}_n^+$ and $\boldsymbol{\rho}_n^-$. Note that the basis function is defined
over two adjoining triangles $T_n^+$ and $T_n^-$ which share a common edge. $A_n^+$ is the
area of triangle $T_n^+$ (and similarly $A_n^-$). $l_n$ is the length of the shared edge. The
vector $\boldsymbol{\rho}_n^+$ is the vector position within triangle $T_n^+$, with the left-hand node of $T_n^+$
as origin; similarly, $\boldsymbol{\rho}_n^-$ is the (negative of the) vector position, with the right-hand
node of $T_n^-$ as origin. There exists a coordinate system known as simplex coordinates
which makes the study of interpolation functions on triangles much simpler,
and is widely used in finite element analysis; these vectors can be written rather
simply in that coordinate system. The terms $l_n/2A_n^-$ and $l_n/2A_n^+$ are normalizing
constants.

Figure 6.1 The two connected triangles $T_n^+$ and $T_n^-$, sharing a common edge, which support
a RWG basis function. (Figure labels: Edge 1, Edge 3.)

Figure 6.2 A vector plot of the RWG basis functions.

The resulting current interpolation is shown in Fig. 6.2. The following points 
may not be immediately apparent. Firstly, it should be noted that this basis func- 
tion has no component normal to the upper or lower sides of either of the triangles, 
but only to the central (shared) edge. Without more detailed theoretical analysis, 
the following is not obvious, and is stated without proof here:¹ the current crossing
this shared edge is linearly interpolated in the tangential direction (i.e. along the 
edge) and interpolated as a constant normal to (i.e. across) the edge. This latter 
value is usually the “degree of freedom” (the unknown value of current) which is 
associated with this basis function; the current associated with this edge is thus
approximated as $\mathbf{J}_n(\mathbf{r}) \approx I_n \mathbf{f}_n(\mathbf{r})$. Note that all these terms are expressed in terms of
the local coordinates on the triangle; again, the conversion to Cartesian coordinates 
is readily performed using simplex coordinates. 
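
As a check on Eq. (6.6), the following minimal Python sketch evaluates one RWG basis function on a pair of flat triangles in the plane. The function name `rwg_basis`, the helper names and the example coordinates are all hypothetical, and the "left-hand" and "right-hand" nodes of Fig. 6.1 are assumed to be the vertices opposite the shared edge. It uses ordinary Cartesian coordinates rather than the simplex coordinates mentioned above.

```python
import math


def _cross(o, p, q):
    """z-component of (p - o) x (q - o) for 2-D points."""
    return (p[0] - o[0]) * (q[1] - o[1]) - (p[1] - o[1]) * (q[0] - o[0])


def _area(p, q, s):
    """Area of the triangle with vertices p, q, s."""
    return 0.5 * abs(_cross(p, q, s))


def _inside(r, p, q, s):
    """True if r lies in (or on the boundary of) triangle p, q, s."""
    d = (_cross(p, q, r), _cross(q, s, r), _cross(s, p, r))
    return not (min(d) < 0.0 and max(d) > 0.0)


def rwg_basis(r, free_plus, free_minus, edge_a, edge_b):
    """Evaluate f_n(r) of Eq. (6.6); T+ and T- share the edge (edge_a, edge_b)."""
    length = math.hypot(edge_b[0] - edge_a[0], edge_b[1] - edge_a[1])
    if _inside(r, free_plus, edge_a, edge_b):
        scale = length / (2.0 * _area(free_plus, edge_a, edge_b))
        # rho+ points away from the free vertex of T+
        return (scale * (r[0] - free_plus[0]), scale * (r[1] - free_plus[1]))
    if _inside(r, free_minus, edge_a, edge_b):
        scale = length / (2.0 * _area(free_minus, edge_a, edge_b))
        # rho- points towards the free vertex of T-
        return (scale * (free_minus[0] - r[0]), scale * (free_minus[1] - r[1]))
    return (0.0, 0.0)


# Shared edge along the y axis, so the edge normal is the x direction.
geometry = dict(free_plus=(-1.0, 0.3), free_minus=(2.0, 0.6),
                edge_a=(0.0, 0.0), edge_b=(0.0, 1.0))
just_left = rwg_basis((-1e-9, 0.25), **geometry)
just_right = rwg_basis((1e-9, 0.25), **geometry)
print(just_left, just_right)
```

With this geometry the x component (normal to the shared edge) is very close to unity on both sides of the edge, even though the two triangles have different areas. This illustrates how the normalizing constants enforce continuity of the normal current across the edge, while the tangential components differ.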

What of the current flowing across the two other edges? To approximate these, 
one defines additional basis functions on each of the other two connected triangles; 
thus on any one triangle, there are three such basis functions, with three associated 
unknowns, which are the normal components of current on each edge. Within the 
element, it should be appreciated that the total current is thus approximated by the 
sum of these three basis functions. With the edges numbered as on Fig. 6.1, the 
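total current on triangle $T_n^+$ is given by:


$$\mathbf{J}(\mathbf{r}) \approx I_1 \mathbf{f}_1(\mathbf{r}) + I_2 \mathbf{f}_2(\mathbf{r}) + I_3 \mathbf{f}_3(\mathbf{r}), \qquad \forall\, \mathbf{r} \in T_n^+ \qquad (6.7)$$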


At the risk of repetition, note that the basis functions carry the vector information;
the unknowns for which the code solves ($I_1$, $I_2$ etc.) are just scalars.


1 Again, because this RWG basis function is so intimately related to the edge-based Whitney function of finite
element analysis, we postpone detailed mathematical analysis of this class of element until Chapter 9.

Some MoM formulations use what is called the “mixed potential integral equation”
(MPIE). In this case, charge is also present as an unknown. From the continuity
equation,² this implies that charge will thus be approximated as piecewise
constant. We will see an example of the MPIE in Chapter 7.

With the integral equations posed, and suitable basis functions defined, one is 
now in a position to solve problems involving surfaces using the MoM. We will 
not attempt to implement such a code, since this is a complex task, but rather 
move directly to the study of a problem using a commercial package. The problem 
we choose is one of the classic canonical problems of electromagnetic analysis, 
namely scattering from a PEC sphere in the resonance regime. 


6.3 Some examples of surface modelling 
6.3.1 Scattering from a sphere 


One of the classical problems of analytical electromagnetics was that of scatter- 
ing from a sphere. Early work on this was done in the nineteenth century by 
Lord Rayleigh (John William Strutt, 1842-1919), who has lent his name to the 
general field of scattering from electrically small objects. For electrically small 
spheres, Lord Rayleigh showed that scattering was proportional to the fourth power 
of frequency; this permitted him to explain the color of the sky. For electrically 
large spheres, the scattering cross-section is simply the cross-sectional area of the 
sphere. In between these extremes, the resonant regime is encountered, where en- 
ergy creeping around the surface of the sphere results in constructive and destruc- 
tive interference. The process of electromagnetic scattering will be recalled from 
Chapter 3. 


A brief historical aside — why is the sky (usually) blue? 


The color of the sky is due to the presence of the earth’s atmosphere. On the 
moon or in space, the sky appears black. For our present purposes, we can view 
the atmosphere as consisting of a large number of small particles and molecules 
in suspension. These are considerably smaller than the wavelength of visible 
light (approximately 400 to 700 nm), so that the scattering from each particle is 
proportional to $1/\lambda^4$, as in the text. Hence, the scattering from the violet
(short-wavelength) end of the spectrum is almost an order of magnitude larger than
that from the red (long-wavelength) end. The spectral irradiance of sunlight — 
see for example [3, Fig. 7.49] — which peaks near the wavelength of blue light, 


2 The time rate of change of charge is the negative of the divergence of the current.
470 nm, and varies by about 30% over the visible spectrum, makes the overall
calculation slightly more complex. It is this scattered radiation which colors the 
sky blue. (It is worth noting in passing that the scattered light is also polarized, 
although we will not pursue this here.) At sunset, however, the radiation has to 
pass through much more of the atmosphere, and the blue scatters out completely, 
leaving the red sunset. When there is dust in the air, this exacerbates the effect, 
leading to spectacular sunsets. More details of this may be found in many texts, 
such as [4, Chapter 12] and [3, Chapter 7]. The latter has a particularly insightful 
discussion, and also provides extensive historical background on this topic. 


The echo width of a three-dimensional target is also known as its radar cross-section
(RCS). It is usually abbreviated $\sigma$. The RCS is defined as follows:


$$\sigma(\theta, \phi, f) = \lim_{R \to \infty} 4\pi R^2 \frac{|E_{\text{scat}}|^2}{|E_{\text{inc}}|^2} \qquad (6.8)$$

R is the distance to the target. The dimensions of the RCS are square meters, since
it is in essence an equivalent area. Frequently, results are given in dB form, and
quite often normalized to 1 m², in which case the symbol dBsm is often used. The
RCS of a target is in general a function of orientation and frequency, and this has
been explicitly indicated above. Note that this definition is entirely equivalent to


$$\sigma(\theta, \phi, f) = \lim_{R \to \infty} 4\pi R^2 \frac{P_{\text{scat}}}{P_{\text{inc}}} \qquad (6.9)$$

The RCS is a far-field parameter; once the surface currents have been found
using the MoM, the radiated fields may be computed in a straightforward fashion 
using standard antenna theory. 

As a simple example of a scattering problem, we will now study the RCS of 
a sphere. A highly conducting sphere with a radius of 5 cm will be chosen; this 
is the typical size of anti-personnel landmines (although of course these are gen- 
erally buried, and also unfortunately usually made largely of non-metallic mate- 
rials to make detection even more difficult). We expect the interesting resonance 
interactions to occur when the circumference of the sphere is of the order of a 
wavelength, hence $\lambda \approx 2\pi a$. This corresponds to a frequency of around 1 GHz.
Running the simulation from 300 MHz to 6 GHz should produce some interesting 
results. 
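
The arithmetic behind these numbers is easily checked. The short Python sketch below (function names are hypothetical, and the exact free-space speed of light is assumed) converts between frequency and ka for the 5 cm sphere, and also expresses the high-frequency limit, the cross-sectional area of the sphere, in dBsm.

```python
import math

C0 = 299_792_458.0  # speed of light in free space, m/s


def ka_from_frequency(freq_hz, radius_m):
    """Sphere circumference in wavelengths, ka = 2*pi*a/lambda."""
    return 2.0 * math.pi * freq_hz * radius_m / C0


def frequency_from_ka(ka, radius_m):
    """Frequency at which the sphere has the given ka."""
    return ka * C0 / (2.0 * math.pi * radius_m)


def to_dbsm(sigma_m2):
    """RCS in dB relative to one square meter."""
    return 10.0 * math.log10(sigma_m2)


radius = 0.05  # m
print("f at ka = 1 [MHz]:", frequency_from_ka(1.0, radius) / 1e6)
print("ka at 300 MHz    :", ka_from_frequency(300e6, radius))
print("ka at 6 GHz      :", ka_from_frequency(6e9, radius))
print("pi*a^2 in dBsm   :", to_dbsm(math.pi * radius ** 2))
```

The resonance condition ka = 1 falls at roughly 954 MHz, and the 300 MHz to 6 GHz sweep covers ka from about 0.31 to about 6.3, so the simulation spans the Rayleigh regime and the first few resonances.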

Note that we are only going to investigate the back-scatter from the sphere; 
hence, only one RCS angle is required (the same one the field is incident from). 
The results of the analysis are shown in Fig. 6.3. The RCS has been normalized
by the high-frequency limit $\pi a^2$ to illustrate more clearly the different scattering
regimes; note how the RCS initially climbs steeply (this is the Rayleigh scattering
regime), then oscillates sharply through the resonance regime, before finally
converging to the high-frequency limit. The horizontal axis has also been normalized,
by plotting $ka = 2\pi a/\lambda$ (the sphere circumference in wavelengths). Note the peak
as expected at ka = 1. Also shown on this plot is the exact analytical solution,
computed as a sum of spherical Hankel functions [5, Eq. (11-247), p. 657] (more
on this shortly). When compared with the exact solution, we note that the accuracy
with which the resonances are computed decreases as the frequency increases.

Figure 6.3 Normalized RCS of a PEC sphere plotted against circumference in wavelengths
ka.

If we were to analyze this problem over a rather larger frequency band, we 
would find that eventually, the result should converge to the high-frequency limit. 
We cannot do this with the present file, because our discretization will not be suf- 
ficiently fine for frequencies much beyond 6 GHz. Refining the discretization will 
result in far longer execution times. However, some thought about the problem 
shows that we can use symmetry to generate a more efficient solution. The incident
electric field is x polarized, traveling in the $-\hat{z}$ direction. As such, there is
a plane of electric symmetry in the plane x = 0. Similarly, there is a plane of
magnetic symmetry in the plane y = 0. Finally, there is a plane of geometrical
symmetry in the plane z = 0. (In this last plane, the geometry is symmetrical, but
not the excitation.) Results for a wider frequency range computed using symmetry
are shown in Fig. 6.4. Note the improvement in the resonances when compared to
Fig. 6.3. However, at the high-frequency end, the mesh is too coarse even with this
model, as is clear by comparison with the analytical solution.

Figure 6.4 Normalized RCS of a PEC sphere plotted against circumference in wavelengths
ka. Results were computed exploiting symmetry.


Modelling hints — modelling spheres 


All meshers generate some approximation of the actual spherical surface; in the 
case of FEKO, the triangular mesh is inscribed within the sphere. (FEKO pro- 
vides the KU card to generate a spherical section or a sphere, which makes mod- 
elling the sphere very straightforward.) The model can be improved by using a 
slightly larger radius, chosen to provide the same surface area as the sphere.
Conveniently, FEKO computes the surface area of the triangles; for the first model,
the area was 0.03096 m², whereas the surface of a 0.05 m radius sphere should be
0.03142 m². Increasing the radius by a factor of 1.007, the square root of this ratio,
should provide a slightly better model.
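
The equal-area correction quoted above amounts to one line of arithmetic, sketched here in Python. The function name is hypothetical; the mesh area is the value reported in the text.

```python
import math


def equal_area_scale(mesh_area_m2, radius_m):
    """Radius scale factor giving the inscribed mesh the area of the true sphere."""
    exact_area = 4.0 * math.pi * radius_m ** 2
    # Area scales with the square of the radius, hence the square root.
    return math.sqrt(exact_area / mesh_area_m2)


scale = equal_area_scale(0.03096, 0.05)
print("scale factor    :", scale)
print("corrected radius:", 0.05 * scale, "m")
```

For the first model this gives a corrected radius of a little under 50.4 mm.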


A couple of closing comments on this study would be in order. Firstly, because 
a sphere is rotationally symmetric, we could have used a field incident from any 
angle. The choice of the x-polarized field, traveling in the $-\hat{z}$ direction, was however
convenient. Note that if a different incident field were used, results would
(or should) be very similar, but would not be identical, since the mesh is slightly
“directional.” Similar comments apply if one compares results computed using a
sphere created using symmetry to results from a sphere created directly in its
entirety.

Finally, an important point about the physics and engineering of scattering 
should be made. We computed the RCS in only one direction — straight back in 
the direction of the incident field. In applied physics, this parameter is generally 
known as the back-scatter cross-section; in radar engineering, this is called the 
monostatic RCS, and is the parameter usually used in radar systems analysis. It 
is (normally) the parameter appearing in the radar range equation. Most radars 
are monostatic, which means that they use the same antenna for transmit and re- 
ceive, or at least the transmitter and receiver are located very close to one another. 
As already mentioned, the monostatic RCS of a sphere is not a function of an- 
gle — note that this is the only structure of which this is true! However, there is 
another type of RCS, bistatic RCS. In this case, the transmit and receive anten- 
nas are not in the same location, and the angles of incidence and reflection are 
no longer the same. (Very few bistatic radars have been built, even fewer — if 
any — deployed operationally.) Although the monostatic RCS of a sphere is not 
angle dependent, the bistatic RCS is. The bistatic RCS can also be computed effi- 
ciently using MoM codes, since it requires only a different excitation vector to be 
computed. 


6.3.2 The analytical solution 


The exact solution of scattering from a PEC sphere, plotted in Figs. 6.3 and 6.4, is 
one of the classic analytical solutions in electromagnetics, dating back to the turn 
of the previous century. Nonetheless, despite the venerable status of the solution, 
there are some points which are worth making about it, and indeed about analytical 
solutions in general. 


A brief historical aside — Mie scattering 


The analytical solution for scattering from a PEC sphere was originally derived 
by Mie and published in 1908, and the solution bears his name to this day. 
Debye undertook a very similar study, published in 1909. For details, see [6, 
p. 415]; for elegant sketches of the fields for the first four modes, reproduced 
from Mie’s paper, see [6, p. 567]. Stratton’s book is unfortunately difficult to 
obtain nowadays; the derivation may also be found in somewhat more recent 
texts, such as [7, Chapter 6], and a particularly detailed derivation is given in 
[5, Section 11.8].

The monostatic RCS is given by the following expression:


$$\sigma = \frac{\lambda^2}{4\pi} \left| \sum_{n=1}^{\infty} \frac{(-1)^n (2n+1)}{\hat{H}_n^{(2)\prime}(ka)\, \hat{H}_n^{(2)}(ka)} \right|^2 \qquad (6.10)$$


with a the radius of the sphere and k the free-space wavenumber. The function
$\hat{H}_n^{(2)}(ka)$ is the alternative spherical Hankel function. It is related to the regular
cylindrical Hankel function by [5, p. 938]


$$\hat{H}_n^{(2)}(x) = \sqrt{\frac{\pi x}{2}}\, H_{n+1/2}^{(2)}(x) \qquad (6.11)$$


The prime in $\hat{H}_n^{(2)\prime}(ka)$ indicates differentiation with respect to the argument.

This would really appear to be a relatively straightforward formula to implement.
MATLAB provides only the regular cylindrical Hankel function, but the scaling
required by Eq. (6.11) is very easy to implement. For FORTRAN implementations,
routines are available in [8, Chapter 6], although one will need to build
the Hankel function from its constitutive Bessel functions of the first and second
kinds, viz. $H_p^{(2)}(x) = J_p(x) - jY_p(x)$. The derivative requires some simple
manipulation to evaluate, using the rule for the differentiation of products applied to
Eq. (6.11), and the standard identity [5, p. 936]


$$\frac{d}{dx}\left[H_p^{(2)}(\alpha x)\right] = -\alpha H_{p+1}^{(2)}(\alpha x) + \frac{p}{x} H_p^{(2)}(\alpha x) \qquad (6.12)$$

to obtain:

$$\hat{H}_n^{(2)\prime}(x) = \sqrt{\frac{\pi x}{2}} \left[ \frac{1}{2x} H_{n+1/2}^{(2)}(x) - H_{n+3/2}^{(2)}(x) + \frac{n+1/2}{x} H_{n+1/2}^{(2)}(x) \right] \qquad (6.13)$$

Hence, Eq. (6.10) can be implemented within a few lines of code. However, one 
needs to be cautious! Routines to compute Bessel functions (by which we include 
Hankel functions) are not bulletproof. In particular, when the argument (ka in this 
case) or the order (n) becomes very large, the results lose accuracy. Good imple- 
mentations should warn of such problems: MATLAB, for instance, provides five 
different error flags, ranging from warnings of possible loss of precision to out- 
right error messages and not returning a numeric value at all. One must check such 
error flags! In the present case, exceeding some hundred terms or so is sufficient 
to trigger error messages. 
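
To illustrate how compact an implementation of Eq. (6.10) can be, a minimal Python sketch follows. It is not the author's MATLAB script, and it deliberately takes a different route from Eqs. (6.11) to (6.13): instead of calling a library routine for cylindrical Hankel functions, it builds $\hat{H}_n^{(2)}$ by upward recurrence from the closed forms for n = 0 and n = 1, and obtains the derivative from $\hat{H}_n^{(2)\prime}(x) = \hat{H}_{n-1}^{(2)}(x) - (n/x)\hat{H}_n^{(2)}(x)$. The function names and the default truncation rule (a Wiscombe-style estimate a little above ka) are assumptions made for this sketch only.

```python
import cmath
import math


def riccati_hankel2(n_max, x):
    """Return (H, dH) with H[n] = x*h_n^(2)(x) and dH[n] its derivative, n = 0..n_max."""
    exp_term = cmath.exp(-1j * x)
    H = [1j * exp_term, (1j / x - 1.0) * exp_term]  # closed forms for n = 0 and 1
    for n in range(1, n_max):
        # Three-term recurrence, used in the upward direction
        H.append((2 * n + 1) / x * H[n] - H[n - 1])
    dH = [exp_term]  # derivative of j*exp(-jx)
    for n in range(1, n_max + 1):
        dH.append(H[n - 1] - n / x * H[n])
    return H, dH


def normalized_rcs(ka, n_terms=None):
    """Monostatic RCS of a PEC sphere from Eq. (6.10), divided by pi*a^2."""
    if n_terms is None:
        # Assumed truncation rule; the text only states that about ka terms are needed
        n_terms = int(math.ceil(ka + 4.0 * ka ** (1.0 / 3.0) + 2.0))
    H, dH = riccati_hankel2(n_terms, ka)
    total = sum((-1) ** n * (2 * n + 1) / (dH[n] * H[n])
                for n in range(1, n_terms + 1))
    # lambda^2/(4*pi) divided by pi*a^2 is simply 1/(ka)^2
    return abs(total) ** 2 / ka ** 2


for ka in (0.05, 0.5, 1.0, 2.0, 5.0, 20.0):
    print(f"ka = {ka:5.2f}   sigma/(pi a^2) = {normalized_rcs(ka):.4g}")
```

The sketch can be checked against the limiting behavior discussed earlier: for small ka the result should follow the Rayleigh law $9(ka)^4$, a peak should appear near ka = 1, and for large ka the result should approach unity. The cautions in the text still apply, since the recurrence will eventually overflow or lose accuracy if far more terms are requested than the argument warrants.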

Needless to say, the infinite sum in Eq. (6.10) must also be truncated at some 
point. In Fig. 6.5, results are shown for the RCS for the sphere as the maximum 
number of terms is increased; this has been graphed on a semi-logarithmic scale, 
so that the variation is more easily seen. Plotting against ka is especially insightful,
since it is clear that the number of terms required is approximately equal to this
product. (This is not coincidental: these terms correspond to circumferentially
varying modes, and modes with significantly more rapid variation than ka
contribute primarily to the reactive near-field only.)

Figure 6.5 Convergence of the analytical solution for the RCS of a PEC sphere, as a function
of the number of terms used.

For electrically large spheres, Eq. (6.10) is clearly going to be problematic to 
evaluate directly, and one needs to use asymptotic forms to retain accuracy. 


A philosophical aside — on “exact” analytical solutions 


The above discussion raises a number of interesting points about the nature of 
“exact” analytical solutions. Critics of our present-day reliance on numeric codes 
sometimes forget that even pristine analytical solutions are usually approximate 
in reality, when it comes to evaluating them; such solutions, derived from sep- 
aration of variables and suitable special functions, usually involve infinite sum- 
mations which must in practice be truncated. Furthermore, the evaluation of the 
special functions is almost always done computationally nowadays, and as we 
have commented, this process is by no means always reliable. (Even tables of 
functions are not always error free.) It is perhaps the ultimate irony that the au- 
thor verified his MATLAB implementation of Eq. (6.10) by comparing the results 
to FEKO computations ...

Figure 6.6 Love’s form of the equivalence principle.


6.4 Modelling homogeneous material bodies using equivalent currents 


In the preceding section, we discussed modelling structures consisting of PEC 
(perfect electric conductor) material. The current which the MoM computes in 
this case is the real, physical current, and is what would be measured were one to 
probe the surface current using a loop, for instance. However, there is another in- 
teresting application of surface currents: modelling homogeneous material bodies, 
that is, dielectric (or magnetic) regions. 

All of these rest on the application of the surface equivalence theorem, first 
introduced in 1936 by Schelkunoff. It states that the fields outside an imaginary 
closed surface can be obtained by placing, over the closed surface, suitable electric 
and magnetic current densities that satisfy the boundary conditions. Furthermore, 
the fields inside the surface can be chosen essentially arbitrarily, since the problem 
is only “equivalent” in the exterior region. When this imaginary surface coincides 
with a real surface, interesting physics emerges with specific choices of the internal 
fields. For PEC modelling, the form of the equivalence principle which is generally 
used is Love’s equivalence principle, illustrated in Fig. 6.6 for a general surface. 
With this form, the fields inside the body are assumed zero. Since the boundary
condition at a PEC surface requires that the tangential total electric field be zero
(and hence also the magnetic surface current), only the electric surface current is
non-zero. Because it is equal to $\hat{n} \times (\mathbf{H}_{tot} - 0)$, where $\mathbf{H}_{tot}$ is the total magnetic
field just above the surface and the 0 represents the internally zeroed fields, it is
also the actual current. This form is also very convenient: since the field has been
chosen as zero in the internal region, the material in this region can be replaced
arbitrarily; usually, it is chosen to have the same value as the exterior region, which
is usually free space in antenna problems.⁴ This is very important, since it permits
the use of the free space Green function; we usually apply this without fully
discussing the underlying justification.


3 The approach can be extended to work for highly conducting structures.

In passing, note that there is another variant of this principle which one quite of- 
ten encounters in the theoretical analysis of aperture antennas. In this case, instead 
of replacing the internal region with free space, one uses a PEC body. If this is a 
half-space, one can then use image theory and hence the Green function for free 
space (again) to solve the problem. There are yet other forms which are useful in 
specific circumstances. 

When the material body is a homogeneous dielectric or magnetic structure,
we can apply the same approach as with the PEC body; there are two differences, 
however. Firstly, the currents are now fictitious (in other words, one would not be 
able to measure them with some cleverly devised experiment), and secondly, both 
electric and magnetic equivalent surface currents are required. 


6.5 Scattering from a dielectric sphere 


Having just discussed a PEC sphere, it is now an interesting exercise to repeat 
the analysis for a dielectric sphere. The model is very similar to the PEC sphere. 
Results are shown in Fig. 6.7. A moderate value, $\epsilon_r = 4$, has been chosen for the
relative permittivity, otherwise the sphere has a very low signature. Results are
normalized to the asymptotic limit for the PEC sphere, $\pi a^2$. It is interesting that
the RCS of the dielectric sphere exceeds that of the PEC sphere for ka > 1.5. Both
unfortunately and surprisingly, there do not appear to be computed data for this
particular problem of RCS versus ka widely available in the literature, although
the analytical solution has long been well known. In [9, Fig. 6a], results are given
for the bistatic scattering from a ka = 3 sphere, and the similarly normalized result
for $\theta = 0$ is around 25; the FEKO result is a little smaller but in the same region at
$ka \approx 3.3$.

To validate this computation, we can use another approach for modelling di- 
electrics available within FEKO, namely equivalent volumetric currents. In this 
case, the entire volume is meshed using cubical cells — this permits the material 
properties to vary from cell to cell, but at much higher computational cost (we will 
discuss this shortly). Results computed using the volumetric approach, as well as 
a surface current model using a slightly finer mesh, are also shown on Fig. 6.7. 
The agreement between the surface and volume formulations is very good up to 
just above ka = 2, which is about the point at which the mesh density drops below 


4 Note that the whole argument also works in reverse for the interior region: in this case, it is the fields in the 
exterior region which are arbitrary. This is not very useful in antenna modelling, however.

Figure 6.7 RCS of a dielectric sphere, radius a, $\epsilon_r = 4$, compared to a PEC sphere. All
results are normalized by $\pi a^2$. k is the free-space wavenumber.


$\lambda_d/10$, where $\lambda_d$ is the wavelength in the dielectric. (Because we are effectively
modelling fields in an electrically denser medium, it is the wavelength in the di- 
electric which concerns us.) Since the volume approach meshes the sphere with 
small cubes, as opposed to a conforming triangular surface mesh, one
can expect the volume approach to be slightly less accurate geometrically, in par- 
ticular at higher frequencies. This is confirmed by a calculation using a slightly 
smaller edge length for the surface mesh; the agreement between the two surface 
current meshes is good. 
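
The mesh-density criterion is simple to evaluate. The Python sketch below computes the wavelength in the dielectric and the corresponding maximum edge length for a given ka. The function names are hypothetical, and the 5 cm radius is an assumption carried over from the PEC sphere, since the text states only that the model is very similar.

```python
import math


def dielectric_wavelength(ka, radius_m, eps_r):
    """Wavelength inside the dielectric for a sphere of given free-space ka."""
    free_space_wavelength = 2.0 * math.pi * radius_m / ka
    return free_space_wavelength / math.sqrt(eps_r)


def max_edge_length(ka, radius_m, eps_r, edges_per_wavelength):
    """Longest mesh edge satisfying the lambda_d / edges_per_wavelength rule."""
    return dielectric_wavelength(ka, radius_m, eps_r) / edges_per_wavelength


radius = 0.05   # m, assumed equal to the PEC sphere radius
eps_r = 4.0
for per_wavelength in (8, 10):
    edge = max_edge_length(math.pi, radius, eps_r, per_wavelength)
    print(f"lambda_d/{per_wavelength} at ka = pi: {1e3 * edge:.2f} mm")
```

Under these assumptions the wavelength in the dielectric at $ka = \pi$ is 50 mm, half the free-space value, so the mesh must be twice as fine in each direction as an equivalent free-space criterion would suggest.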

A note of caution here: such intracode validation is usually questionable, but in 
this case, FEKO is using two quite different techniques to compute the RCS, so we 
can place some faith in this result. 

Although the equivalent surface current model is probably the most computationally
efficient available for general problems,⁵ the requirement to treat both
the equivalent electric and magnetic surface currents doubles the number of unknowns,
and hence quadruples the amount of memory, and increases the run-time
by a factor of between four and eight, depending on the problem size, when compared to a
PEC sphere. (Eight is the asymptotic limit, for problems with a very large number
of unknowns where the matrix solution dominates the run-time; we discuss


5 For the dielectric sphere, the Green function is known analytically, so for this special case only, one could 
develop a faster solver.

Table 6.1 Comparison of computational requirements for the PEC versus
dielectric sphere

                     PEC     Dielectric (surface)                  Dielectric (volume)
                             $\lambda_d/8$      $\lambda_d/10$       $\lambda_d/8$      $\lambda_d/10$
N                    663     2 × 663 = 1326     2 × 1008 = 2016      3 × 772 = 2316     3 × 1370 = 4110
Memory (Mbyte)       27.6    108                249                  329                1033
Relative run-time    1       4.8                11.5                 11.5               N/A

N is the number of unknowns. Run-time is per frequency point. The edge lengths are given
for $ka = \pi$.


this shortly.) A summary of the computational requirements is given in Table 6.1. 
The run-times are given normalized to the PEC case. Note how execution time in- 
creases by a factor of about five for the dielectric sphere using surface equivalence, 
and more than ten when using the volumetric mesh. (The N/A indicates that the 
problem was too large to run with the available resources.) 

Also shown in this table is the effect of refining the discretization. Changing the
edge length from $\lambda_d/8$ to $\lambda_d/10$, with the corresponding frequency in this case
chosen as that corresponding to $ka = \pi$ (towards the upper end of the frequency
band), results in an enormous change in computational requirements. Indeed, the
$\lambda_d/10$ volumetric discretization was too large for a typical laptop or desktop PC at
the time of writing, indicated by N/A in the table.
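
The memory figures in Table 6.1 can be compared with simple $N^2$ scaling of the dense interaction matrix. The Python sketch below takes the PEC case as the reference and predicts the other entries; the dictionary and function names are hypothetical, and the data are copied from the table.

```python
# (number of unknowns N, memory in Mbyte), copied from Table 6.1
table_6_1 = {
    "PEC": (663, 27.6),
    "surface, lambda_d/8": (1326, 108.0),
    "surface, lambda_d/10": (2016, 249.0),
    "volume, lambda_d/8": (2316, 329.0),
    "volume, lambda_d/10": (4110, 1033.0),
}


def predicted_memory(n, ref_n=663, ref_mem=27.6):
    """Memory predicted by N^2 scaling from the PEC reference case."""
    return ref_mem * (n / ref_n) ** 2


for label, (n, mem) in table_6_1.items():
    print(f"{label:22s} N = {n:5d}  table: {mem:7.1f}  "
          f"N^2 scaling: {predicted_memory(n):7.1f}")
```

The predictions agree with the tabulated values to within a few percent, which suggests that storage of the full matrix accounts for almost all of the memory in these runs.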


6.6 Computational implications of surface and volume modelling 
with the MoM 


As has just been seen with the analysis of the sphere, modelling surfaces is far 
more computationally expensive than modelling wires. As already discussed in 
Chapter 4, for a typical wire model the number of unknowns N is linearly related 
to the length of the wire. We will use the product kd to characterize this, with k the 
wavenumber and d the length of wire. There are two time-consuming operations 
required by an MoM code with N unknowns, viz. matrix filling and factoring. 
The former is of $O(N^2)$, the latter $O(N^3)$ when using direct solvers (iterative
solvers will be discussed later). However, the constants associated with matrix
filling can be quite large (that of the matrix solve is close to unity) and in practice
one often finds that MoM codes are in the pre-asymptotic region as far as timing
goes, spending more time filling than factoring the matrix. Since N is proportional
to kd, we have an asymptotic cost of $O([kd]^3)$. To store the interaction matrix
[Z], $N^2$ memory locations are required. Hence the amount of memory required
is $O([kd]^2)$ for wires. These properties are also known as the frequency scaling
behavior of the algorithm.

For a surface, although triangles using the vector RWG basis functions are the 
approach generally used in practice, when doing a frequency scaling analysis it 
is easier to consider square patches. To model a surface of size $kd \times kd$, it is
clear that the number of unknowns will now be $M = N \times N$; thus the asymptotic
computational cost is clearly $O([kd]^6)$. (The asymptotic analysis neglects the fact
that when modelling a surface, one needs to approximate the two components
of current on each patch, so in practice surface modelling is costly. As for the
one-dimensional case however, the matrix fill tends to dominate the run-time for
many problems, with a somewhat lower asymptotic behavior.) In terms of memory,
the requirements are $O([kd]^4)$ for surfaces.

To give a concrete example, consider doubling the size of ground plane in the 
helix example discussed in Chapter 5; equivalently, double the frequency — the 
product kd expresses this product of wavenumber and size succinctly. The run-time
will increase by a factor of between $2^4 = 16$ and $2^6 = 64$, and the amount of memory
required will increase by a factor of $2^4 = 16$. (This is approximate since the
helix must also be modelled more finely, but as a wire structure, the frequency 
scaling is somewhat better; however, the requirements of the ground plane in- 
creasingly dominate the considerations.) A factor of 64 is almost precisely the 
difference between minutes and hours and one should appreciate that modelling 
surfaces may require powerful computers and take considerable time. Fortunately, 
there are some methods available to assist in this regard, which we will discuss 
shortly. 

Modelling volumes is even more costly. To model a volume of size $kd \times kd \times kd$,
it is clear that the number of unknowns will now be $M = N \times N \times N$; thus the
asymptotic computational cost is clearly $O([kd]^9)$. (Again, the asymptotic analysis
neglects the fact that when modelling a volume, one needs to approximate the
components of current on each cell, now three of them. On the other hand, once
again the matrix fill tends to dominate the run-time for many problems.) In terms of
memory, the requirements are $O([kd]^6)$ for volumes. We saw these effects clearly
at work in Table 6.1, where a slight change in edge length for the volumetric case
meant that we were unable to solve the problem in a reasonable time, or indeed
even run it at all due to memory limitations.
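
The frequency scaling rules of this section can be collected in a few lines of Python. This is a sketch of the asymptotic rules only: the function name is hypothetical, and constants, the number of current components per element and iterative solvers are all ignored.

```python
def scaling_factors(kd_ratio, dims):
    """Growth factors when kd is multiplied by kd_ratio.

    dims = 1 for wires, 2 for surfaces, 3 for volumes.
    """
    unknowns = kd_ratio ** dims          # N grows as (kd)^dims
    return {
        "unknowns": unknowns,
        "memory": unknowns ** 2,         # dense matrix storage, N^2
        "fill": unknowns ** 2,           # matrix fill, O(N^2)
        "solve": unknowns ** 3,          # direct factorization, O(N^3)
    }


for name, dims in (("wire", 1), ("surface", 2), ("volume", 3)):
    print(name, scaling_factors(2, dims))
```

Doubling kd for a surface gives factors of 16 for memory and matrix fill and 64 for the direct solve, which are the bounds quoted for the ground plane example; for a volume the solve factor is $2^9 = 512$.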


6.7 Hybrid MoM/asymptotic techniques for large problems 


This section is based on a review paper originally published as [10].

Figure 6.8 Wire radiator together with an electromagnetically large scatterer, whose effect
is taken into account using a hybrid formulation. (After [10], ©1999 SATEE.) (Figure labels:
body replaced by equivalent currents in free space; material structure not characterized
by equivalent currents.)


6.7.1 Introduction 


Any combination of CEM techniques can be termed a hybrid. Here it is conve- 
nient to distinguish between exact and approximate hybrid approaches. In the for- 
mer, also known as MoM/Green’s function hybrids, special Green’s functions are 
used to take the effect of the scatterer in Fig. 6.8 into account implicitly. Although 
very powerful for appropriate problems, the restricted number of special Green’s 
functions available limits the generality of this approach. In the latter case, high- 
frequency methods such as physical optics are used to describe approximately the 
interaction between parts of the structure far removed from one another. 

Probably the best known of the exact hybrids is the Sommerfeld potential treat- 
ment for radiators near, on or within stratified media. This will be discussed subse- 
quently in this book. For slotted waveguide array analyses, the appropriate waveg- 
uide Green function has been widely used in MoM formulations. Another special 
Green function that has been used is that for layered spheres [11]. 

Deriving such Green functions is a formidable task: [5] gives a good introduc- 
tion to the process of deriving a Green function, but for more advanced purposes a 
detailed description of dyadic Green functions may be found in [12]. A review of 
this type of hybrid method may be found in [13]. 

We use the term “exact” hybrid method for this approach since the only ap- 
proximations made involve the conventional MoM discretization of the current on 
the radiator/scatterer. There is some disagreement about the use of the term “hy- 
brid” for the MoM/Green function method; we follow the nomenclature of [13] 
here. As regards the use of “exact”, we have already commented that many special
Green’s functions involve theoretically infinite series expansions or, as we will 
see, pose challenging integration problems in the complex plane, as is the case with 
Sommerfeld potentials. 


6.7.2 Moment method/asymptotic hybrids 


Hybridizations of the MoM with various asymptotic techniques are approximate 
in the sense that in addition to the conventional MoM discretization, assumptions 
are made that are only exact in the high-frequency limit. (The MoM is sometimes 
described as a “numerically exact” formulation, in that the only approximations 
are those required to produce a linear system. This type of hybrid is no longer 
numerically exact — even if the equations could be solved exactly, without any er- 
rors introduced by discretization or numerical evaluation of integrals, the method 
is still approximate.) However, these methods are potentially more generally ap- 
plicable than the MoM/Green function hybrids outlined above and we will now 
review physical optics for this purpose. 


6.7.3 Physical optics and MoM hybridization 


Physical optics (PO) is a well established concept in electromagnetic theory [5, 
Section 7.10]. The essence is that the equivalent surface current on a smooth con- 
ducting surface is given by: 


$$ \mathbf{J}_s = 2\,\hat{\mathbf{n}} \times \mathbf{H}_i \tag{6.14} $$


We have already seen in Section 6.1 that this is an approximation of the MFIE. It 
may also be seen as an application of the equivalence principle, with the following
approximations for a sufficiently large structure: firstly, $\mathbf{H}$ can be replaced by
$2\mathbf{H}_i$ (this essentially assumes no end effects); secondly, currents can be “locally”
imaged (hence the factor 2). Note that unlike a ray-based method, integration over
the surface current is still required, but the current in the integrand is now known,
as opposed to the MoM where the current is unknown.
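
As a minimal numerical illustration of Eq. (6.14), the sketch below evaluates the PO current on a flat PEC patch for a normally incident plane wave. The geometry (patch in the $z = 0$ plane, wave travelling in the $-z$ direction with an $x$-directed electric field of 1 V/m) and the helper names `cross` and `po_current` are assumptions made for this example only; the `lit` flag simply zeroes the current in shadow.

```python
ETA0 = 376.730313668  # free-space wave impedance in ohms


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def po_current(n_hat, h_inc, lit=True):
    """Physical optics surface current J = 2 n x H_i; zero if the patch is shadowed."""
    if not lit:
        return (0.0, 0.0, 0.0)
    return tuple(2.0 * c for c in cross(n_hat, h_inc))


# Wave travelling in -z with E = x-hat (1 V/m), so H_i = (-z-hat x x-hat)/eta = -y-hat/eta.
n_hat = (0.0, 0.0, 1.0)
h_inc = (0.0, -1.0 / ETA0, 0.0)
print(po_current(n_hat, h_inc))  # x-directed current of magnitude 2/eta
```

The current is parallel to the incident electric field and has magnitude $2|\mathbf{H}_i|$, as expected from local imaging in an infinite plane.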

In terms of hybridization with the MoM, PO has an enormous advantage in 
being current based — most asymptotic methods (UTD etc.) are field based, and 
this leads to a rather natural MoM/PO hybridization process. The essential idea is 
to use the MoM on small, resonant structures, and in regions near edges, and to 
use PO on large, smooth areas. If applied appropriately, smooth “blending” should 
occur between these regions. The overview in this section closely follows the de- 
velopment presented by Jakobus and Landstorfer [14] and retains their notation. 
The mechanics of hybridization require a brief review of basic MoM theory using
linear operator notation. The scattered fields are set up by currents on surfaces
($\mathbf{J}^{MM}$) and wires ($\mathbf{I}^{MM}$):


$$
\begin{aligned}
\mathbf{E}_s &= \mathcal{L}^E_J \mathbf{J}^{MM} + \mathcal{L}^E_I \mathbf{I}^{MM} \\
\mathbf{H}_s &= \mathcal{L}^H_J \mathbf{J}^{MM} + \mathcal{L}^H_I \mathbf{I}^{MM}
\end{aligned}
\tag{6.15}
$$


$\mathcal{L}^E_J$, $\mathcal{L}^H_J$ etc. are linear operator short-hand for the actual integrodifferential
operators (for example, the EFIE and the MFIE as in Section 6.1). Standard MoM basis
functions are used:


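$$
\begin{aligned}
\mathbf{J}^{MM} &= \sum_{n=1}^{N_J^{MM}} \alpha_n \mathbf{f}_n \\
\mathbf{I}^{MM} &= \sum_{n=1}^{N_I^{MM}} \beta_n \mathbf{g}_n
\end{aligned}
\tag{6.16}
$$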


Jakobus and Landstorfer use piecewise linear basis functions for $\mathbf{g}_n$ and $\mathbf{f}_n$; the
latter are the Rao–Wilton–Glisson triangular vector functions for surfaces as
already discussed. For a PEC, the standard boundary condition $\mathbf{E}_{\tan} = 0$ is applied,
resulting in:


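$$
-\mathbf{E}_{i,\tan} = \sum_{n=1}^{N_J^{MM}} \alpha_n \left(\mathcal{L}^E_J \mathbf{f}_n\right)_{\tan}
+ \sum_{n=1}^{N_I^{MM}} \beta_n \left(\mathcal{L}^E_I \mathbf{g}_n\right)_{\tan}
\tag{6.17}
$$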


Either collocation or weighted residuals can be used to solve for the unknown 
coefficients $\alpha_n$ and $\beta_n$ (in total, $N_J^{MM} + N_I^{MM}$ of them). 

Now, in the region of the scatterer not treated by the MoM, the PO surface 
current is approximated using the same surface patch treatment as in the MoM 
region as 


$$
\mathbf{J}^{PO} = \sum_{n=N_J^{MM}+1}^{N_J^{MM}+N_J^{PO}} \gamma_n \mathbf{f}_n
\tag{6.18}
$$


with $\mathbf{f}_n$ as before and $\gamma_n$ coefficients of surface current in the PO region. It is very 
important to note that $\gamma_n$ are known (in terms of the $\alpha_n$ and $\beta_n$ coefficients) from 
the PO approximation, as shown later. Hence they are not obtained by the solution 
of a linear system; thus the matrix size remains $N_J^{MM} + N_I^{MM}$. 
In the PO region, the PO current $\mathbf{J}^{PO}$ is given by: 


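$$
\begin{aligned}
\mathbf{J}^{PO}(\mathbf{r}) = {} & 2\delta_i\, \hat{\mathbf{n}} \times \mathbf{H}_i(\mathbf{r}) \\
& + \sum_{n=1}^{N_J^{MM}} 2\alpha_n \delta_{J,n}\, \hat{\mathbf{n}} \times \mathcal{L}^H_J \mathbf{f}_n \\
& + \sum_{n=1}^{N_I^{MM}} 2\beta_n \delta_{I,n}\, \hat{\mathbf{n}} \times \mathcal{L}^H_I \mathbf{g}_n
\end{aligned}
\tag{6.19}
$$

$\delta_{J,n}$ and $\delta_{I,n}$ account for possible shadowing, with values of +1 or 0; the optical 
basis of the method will be recalled. 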
When currents in the PO region are included as well, the equation from the 
boundary condition in the MoM region becomes: 


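$$
\left(\mathcal{L}^E_J \mathbf{J}^{MM} + \mathcal{L}^E_I \mathbf{I}^{MM} + \mathcal{L}^E_J \mathbf{J}^{PO}\right)_{\tan} = -\mathbf{E}_{i,\tan}
\tag{6.20}
$$
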
Note that there are two different PO/MoM coupling mechanisms. 


1. The currents in the MoM region contribute to the PO currents via Eq. (6.19) (via the 
summation terms). 

2. The currents in the PO region in turn contribute to the fields in the MoM region and thus 
impact on the boundary condition represented by Eq. (6.20). 


It might appear that this would require some iterative process for self- 
consistency, but the “feedback” effects can be taken into account in closed form. 
The PO currents can be found in terms of the unknown MoM currents as 


$$
\gamma_k = \tau_{i,k} + \sum_{n=1}^{N_J^{MM}} \alpha_n \tau_{Jn,k} + \sum_{n=1}^{N_I^{MM}} \beta_n \tau_{In,k}
\tag{6.21}
$$


with 
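$$
\begin{aligned}
\tau_{i,k} &= \left(\hat{\mathbf{t}}_k^{+} + \hat{\mathbf{t}}_k^{-}\right) \cdot \left(\delta_i\, \hat{\mathbf{n}} \times \mathbf{H}_i\right) \\
\tau_{Jn,k} &= \left(\hat{\mathbf{t}}_k^{+} + \hat{\mathbf{t}}_k^{-}\right) \cdot \left(\delta_{J,n}\, \hat{\mathbf{n}} \times \mathcal{L}^H_J \mathbf{f}_n\right) \\
\tau_{In,k} &= \left(\hat{\mathbf{t}}_k^{+} + \hat{\mathbf{t}}_k^{-}\right) \cdot \left(\delta_{I,n}\, \hat{\mathbf{n}} \times \mathcal{L}^H_I \mathbf{g}_n\right)
\end{aligned}
\tag{6.22}
$$


$\hat{\mathbf{t}}_k^{+}$ and $\hat{\mathbf{t}}_k^{-}$ are unit vectors associated with the kth triangle edge; see [14] for further 
details. It is important to note that all the terms in the above equation are known, 
being either derived from the geometry of the problem, the discretization or the 
chosen basis function. 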
The central idea here is that these PO currents in terms of the MoM unknowns 
can now be substituted into Eq. (6.20). The final result is the following: 


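$$
\begin{aligned}
& \sum_{n=1}^{N_J^{MM}} \alpha_n \Biggl[ \left(\mathcal{L}^E_J \mathbf{f}_n\right)_{\tan}
+ \sum_{k=N_J^{MM}+1}^{N_J^{MM}+N_J^{PO}} \tau_{Jn,k} \left(\mathcal{L}^E_J \mathbf{f}_k\right)_{\tan} \Biggr] \\
& \quad + \sum_{n=1}^{N_I^{MM}} \beta_n \Biggl[ \left(\mathcal{L}^E_I \mathbf{g}_n\right)_{\tan}
+ \sum_{k=N_J^{MM}+1}^{N_J^{MM}+N_J^{PO}} \tau_{In,k} \left(\mathcal{L}^E_J \mathbf{f}_k\right)_{\tan} \Biggr] \\
& \quad = -\mathbf{E}_{i,\tan} - \sum_{k=N_J^{MM}+1}^{N_J^{MM}+N_J^{PO}} \tau_{i,k} \left(\mathcal{L}^E_J \mathbf{f}_k\right)_{\tan}
\end{aligned}
\tag{6.23}
$$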
k=NYM+] 


The above equation summarizes the MoM/PO interaction: the effect of the PO 
is to alter the MoM matrix entries. Note that each MoM entry is modified by 
contributions from all the PO currents; this can become computationally expensive 
and can be neglected under certain conditions, usually when the PO and MoM 
regions are physically separated. (An example is a reflector antenna, where the 
feed is treated with the MoM and the reflector with the PO.) Note further that the 
boundary condition of zero tangential E is only rigorously enforced in the MoM 
region. 
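
The closed-form elimination of the PO coefficients can be illustrated with a small dense linear-algebra sketch. Everything in it is hypothetical: the matrices hold arbitrary made-up numbers rather than evaluated operators, the wire and surface unknowns $\alpha_n$ and $\beta_n$ are lumped into one vector `a`, `Z_mp` stands for the tested fields of the PO basis functions, and `T` and `tau_i` stand for the coupling terms of Eq. (6.21). The point is only that substituting Eq. (6.21) into Eq. (6.20) modifies the MoM matrix and right-hand side without enlarging the system.

```python
def matvec(A, x):
    """Dense matrix-vector product for lists of lists."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def matmul(A, B):
    """Dense matrix-matrix product for lists of lists."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def solve(A, b):
    """Gaussian elimination with partial pivoting (small systems only)."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0j] * n
    for r in reversed(range(n)):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def hybrid_solve(Z_mm, Z_mp, T, tau_i, v):
    """Solve (Z_mm + Z_mp T) a = v - Z_mp tau_i, then recover gamma = tau_i + T a."""
    ZT = matmul(Z_mp, T)
    Z_eff = [[z + zt for z, zt in zip(rz, rzt)] for rz, rzt in zip(Z_mm, ZT)]
    rhs = [vi - ci for vi, ci in zip(v, matvec(Z_mp, tau_i))]
    a = solve(Z_eff, rhs)
    gamma = [ti + ca for ti, ca in zip(tau_i, matvec(T, a))]
    return a, gamma


# Made-up data: 2 MoM unknowns, 3 PO coefficients.
Z_mm = [[4 + 1j, 1 - 0.5j], [1 - 0.5j, 3 + 2j]]
Z_mp = [[0.2 + 0.1j, 0.1j, 0.05], [0.1, 0.3 - 0.1j, 0.02j]]
T = [[0.5, 0.1j], [0.2 - 0.1j, 0.4], [0.1, 0.3 + 0.2j]]
tau_i = [1 + 0j, 0.5j, -0.3 + 0j]
v = [1 + 0j, 0j]

a, gamma = hybrid_solve(Z_mm, Z_mp, T, tau_i, v)
residual = [x + y - z for x, y, z in zip(matvec(Z_mm, a), matvec(Z_mp, gamma), v)]
print(max(abs(r) for r in residual))  # should be at rounding level
```

The system solved is still only as large as the number of MoM unknowns; the cost of the hybrid appears in forming the product `Z_mp T`, which is the per-entry PO contribution mentioned above.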

In the basic MoM/PO hybridization outlined above, edge effects are not taken 
into account by the PO. It is possible to use Fock theory to account for these effects; 
see for example [14]. The approach used is related to Ufimtsev’s physical theory 
of diffraction. 

For very large structures, the integration over the entire structure can still become
very time consuming: although the $O(f^6)$ dependence of the MoM is reduced
enormously, the PO asymptotic dependence is still $O(f^2)$.

Hodges and Rahmat-Samii have shown recently that the MoM/PO hybrid can 
be seen as a special case of a more general EFIE/MFIE hybridization, with the 
MoM/PO as the first term in an iterative Neumann series technique [15]. They 
show good results for two monopoles mounted on opposite sides of a cylinder, 
and thus in each other’s shadow region. However, the use of the MFIE restricts the 
method to smooth closed bodies. 


6.7.4 A FEKO example using the MoM/PO hybrid 


The above theory is available within FEKO, and we will now consider an example 
of its applications. For this example, one of the simplest (and also most effective) 
applications of the MoM/PO hybrid will be chosen. We will mount a $\lambda/2$ dipole 
antenna horizontally above a finite ground plane, $1\lambda \times 1\lambda$ in size. From basic 
image theory, the “image” in the ground plane is out of phase, so the height of the 
dipole above the ground plane should be an odd multiple of $\lambda/4$ 
to produce constructive interference. 
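
The image-theory argument can be checked numerically. The sketch below assumes an infinite, perfectly conducting ground plane (not the finite $1\lambda \times 1\lambda$ plate of the actual model) and looks only in the direction normal to the plane, where the image ray lags the direct ray by a path of twice the height; the function name `broadside_factor` is hypothetical.

```python
import cmath
import math


def broadside_factor(height_in_wavelengths):
    """|1 - exp(-j 2 k h)|: direct ray plus out-of-phase image ray, normal to the plane."""
    kh = 2.0 * math.pi * height_in_wavelengths
    return abs(1.0 - cmath.exp(-2j * kh))


for h in (0.125, 0.25, 0.5, 0.75):
    print(f"h = {h:5.3f} wavelengths: factor = {broadside_factor(h):.3f}")
```

The factor reaches its maximum of 2 at heights of $\lambda/4$ and $3\lambda/4$ and vanishes at $\lambda/2$, in line with the odd-multiple rule.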


Modelling hints — symmetry 


Once again, symmetry can be exploited to build the model and improve the com- 
putational efficiency. By mounting and feeding the dipole symmetrically about 
the y = 0 and z = 0 planes, magnetic and electric symmetry can be used. Note 
that the quarter-ground plane is imaged first in the y = 0 plane before the half- 
dipole is added; one does not want to image a wire on top of itself! Following 
this, the half-ground plane and half-dipole are then imaged in the z = 0 plane to 
create the whole model, and the feed segment is then added. 


Two approaches have been used to solve this problem: firstly, the MoM has 
been used for the entire problem; then, the MoM is applied to the dipole only, and 
the effect of the reflector is approximated using the PO. The FEKO models for 
both are shown in Fig. 6.9 — note that the models appear identical, since it is the 
mathematical approach, rather than the geometrical model, which differs. Results 
comparing the far-field H-plane (z = 0) radiation patterns computed using the two 
approaches are shown in Fig. 6.10. 

The results shown in Fig. 6.10 compare favorably. Using some advanced meth- 
ods within FEKO which correct the PO currents at the edge of the reflector, it is 
possible to do even better. However, a caution is in order. It must be appreciated 
that the MoM/PO hybrid is approximate; how good the approximation is relies 
quite heavily on the experience of the user. As such, it is useful to build confidence 
by initially comparing results using MoM/PO hybrids with full MoM solutions 
as far as possible. Efficient use of symmetry usually allows the solution of quite 
electrically large MoM problems, although these may of course take some time to 
compute. Once one is reasonably confident of the level of accuracy for a particular 
class of applications, one may then do production runs investigating changes to 
and optimization of the structure, etc. It would, however, be very unwise to base 
major design decisions on an MoM/PO hybrid solution which one has not carefully 
evaluated beforehand. (Of course, this is true in general of computed solutions, but 
even more so in this case.) Problems which generally lend themselves very well 
to the MoM/PO hybrid approach are reflector-type problems, since the radiating 
feed element is largely decoupled from the reflector, and ray-tracing issues are not 
problematic. 
Figure 6.9 FEKO model of a dipole in front of a reflector. 


Modelling hints — using the MoM/PO hybrid within FEKO 


FEKO is the only commercial code offering this functionality at the time of 
writing, so the following discussion only applies to FEKO. Physical optics is 
controlled using the PO card, which offers a number of parameters which require 
some brief discussion. The first parameter, requiring a label, is obvious; the PO 
is applied to the structures with this label. The second parameter controls ray 
tracing. Because the PO is an optics-based method, in general one needs to ray 
trace to determine whether a triangle is in the “lit” or “shadow” region relative 
to the source. In this case, it is clear that all triangles are illuminated, and ray 
tracing may be switched off to save time. The third parameter relates to the 
use of symmetry in ray tracing and is irrelevant here since ray tracing has been 
deactivated. The fourth parameter controls MoM-PO coupling, as described in 
the previous theoretical section; here, the full treatment is applied and the regions 
are fully coupled. The fifth parameter is another optics based one; it determines 
the number of multiple reflections to be taken into account. In this case, none 
are required. The final parameter is for specialized use and the default should be 
used here. 

FEKO offers additional functionality to improve PO modelling. The KA card 
permits one to define the boundary of the PO region, and “fringe wave” currents 
are then used in this region to improve the approximation. The VS card allows 
one to specify “visibility” information, to reduce the time required when multiple 
reflections are present. The FO card uses Fock theory to improve the PO surface 
current. 


6.8 Other approaches for the solution of electromagnetically 
large problems 


6.8.1 Background 


Figure 6.10 A comparison of the H-plane far-field patterns computed using the MoM and 
MoM/PO hybrid. 


By the late 1980s, research on the MoM was confronted with the basic problem of 
the high asymptotic cost of the method: $O(N^3)$ in terms of number of unknowns, 
or $O([kd]^6)$ for surfaces, as we have seen for direct solvers. Little can be done to 
improve this further, apart from the application of high-performance computing (of 
which more anon). Iterative solvers started attracting much attention in CEM in the 
late 1980s — even though the basic algorithms, in particular the conjugate gradient 
(CG) algorithm, have been known since the 1940s — since the computational cost 
is $O(N^2)$ per iteration, with overall cost $O(N_{\text{iter}} N^2)$ for $N_{\text{iter}}$ iterations. Clearly, if 
$N_{\text{iter}}$ can be kept well below N, algorithms with better scaling properties are
possible. It has to be said here that unfortunately, the considerable experience
accumulated by many researchers over the years has indicated that it is very difficult to 
predict $N_{\text{iter}}$ for arbitrary problems; testing the algorithms on canonical problems, 
such as spheres, has frequently resulted in highly over-optimistic predictions. (The 
reason is the relatively simple eigenvalue structure of such problems; since the it- 
erative methods usually used variants of the CG method, the rate of convergence 
is heavily determined by the eigenvalue spectrum.) So, using iterative techniques 
alone is not sufficient; in any case, this does nothing to reduce the $O(N^2)$ memory 
requirements of the method, which is frequently as serious a problem as computa- 
tional cost. 

From a slightly different perspective, the integral equation formulations which 
we have worked with are essentially convolutions of the Green function with the 
currents. Familiarity with signal processing methods immediately suggests that 
convolution in one domain may be more easily implemented by multiplication in 
the Fourier transform domain; we will exploit this idea in Chapter 7, although for 
a slightly different purpose. But for now, the idea that one could use a Fourier 
transform immediately suggests the use of the fast Fourier transform (FFT), and 
indeed, this was one of the first successful “fast” methods in electromagnetics. 
However, it was limited in terms of application to general structures with arbi- 
trary meshes. An extension of this concept, the adaptive integral method, removes 
this restriction. However, it is an alternative approach, the fast multipole method 
(FMM), which provided the theoretical breakthrough in the early 1990s. In its 
most powerful multi-level form it reduced the asymptotic cost from $O(N^2)$ to 
$O(N \log N)$, and it is the most popular of the fast methods today. It was a
breakthrough as significant as Berenger’s PML absorber,⁶ although the theory is rather 
more complex, and efficient implementation in particular is challenging. (By
comparison, the PML is really quite straightforward to code.) Despite the complexity 
of the theory underlying the FMM, since it is starting to be offered by commercial 
codes at the time of writing,⁷ an elementary introduction is certainly appropriate 
at this stage. Before looking at fast techniques, however, we will briefly discuss 


6 Hopefully, this comparison will not cause confusion: the PML and FMM are entirely different methods, with 
quite different aims. 

7 FEKO appears to have been the first publicly available commercial code to incorporate the FMM; the frequently 
referenced Fast Illinois Solver Code (FISC) has numerous restrictions on its distribution, especially outside the 
USA, due to US military funding during its development. 
high-performance computing, which is also an important topic when the solution 
of large electromagnetic problems is considered. 


6.8.2 High-performance computing 


All the methods and technologies described in this section had their genesis in 
the late 1980s. One approach to the problem of high computational cost, and one 
which is still bearing fruit today, was exploitation of the emerging technology of 
parallel processing. Parallel processing, or indeed high-performance computing 
(HPC) in general, simply provides more computational power; it does not address 
the fundamental algorithmic issue of computational cost, but can significantly push 
the envelope of any particular computational technique. At its heart, there are only 
two ways of making a given computation faster: either increase the rate at which a 
computer can process information, or do more operations at the same time. The for- 
mer of course has been the dominant technological drive through several decades, 
manifested by clock speeds which, for typical personal computers, have increased 
from some tens of MHz at the start of the 1990s to some GHz by the millennium, 
only a decade later. The latter has spawned a variety of methods; historically, par- 
allel processing originally split into pipelining and replication. 

Pipelining involves overlapping parts of operations in time and was the approach 
taken by the vector supercomputers, such as the early CRAY machines (the first 
of which was installed in 1976). Replication provides more than one functional 
unit (e.g. CPU), permitting operations to be performed simultaneously, and was 
the competing approach taken by large processing arrays. Another nomenclature 
encountered in the earlier literature was single instruction multiple data (SIMD) 
and multiple instruction multiple data (MIMD) machines. This taxonomy was 
introduced by Flynn in the early 1970s [16]; a MIMD system described a computer
consisting of a number of nodes, each with at least a processing element, operating 
independently on its own local instruction stream and data, whereas a SIMD sys- 
tem performed the same operation in lockstep to all data. Machines were also 
characterized in terms of how data were exchanged; many of the early experi- 
mental systems used were local memory, message passing systems. In these, all 
memory was divided up locally amongst the available processors, and a processor 
could only directly access its own memory. Access to the memory on other pro- 
cessors was done by explicit message passing, which was much slower than direct 
memory access. However, the problem of memory contention that complicated 
the other main competing approach to memory allocation, namely global memory, 
was removed with this approach. Technological advances have however blurred 
many of the traditional distinctions. Even the ubiquitous CPUs encountered in 
personal computers contain significant elements of pipelining and replication, 
and increasingly sophisticated architectures now blur the global/local memory 
dichotomy. 

The basic concept of parallel processing was, and still remains, to provide P 
processors or processing elements, and by splitting the computational load, reduce 
the overall run-time by a factor as close to P as possible. Several methods have 
been proposed to characterize parallel computers, but the most widely used are 
speed-up and efficiency. Speed-up, S, is the ratio of time taken by an equivalent 
serial algorithm running on one processor, $T_1$, to the time taken by the parallel 
algorithm using P processors, $T_P$. Efficiency, $\epsilon$, is the speed-up normalized by the 
number of processors. Formally, 


$$ S = \frac{T_1}{T_P} \tag{6.24} $$


$$ \epsilon = \frac{S}{P} \tag{6.25} $$


S is usually bounded from above⁸ by P, and $\epsilon$ is hence usually bounded from 
above by 1. 

Some algorithms can be parallelized very easily and efficiently: examples are the 
FDTD and iterative methods. Some, such as LU decomposition, are rather less ob- 
vious, but can nonetheless be very efficiently parallelized with some clever data de- 
composition techniques. All the major algorithms in CEM have been parallelized 
with varying degrees of success over the last decade; perhaps the most problem- 
atic one has been the FEM, due to the large, unstructured, but highly sparse matrix 
characterizing the method. Examples of measured efficiencies on a transputer ar- 
ray are shown in Fig. 6.11. (The results are shown for slightly different numbers 
of processors; this was due to different interconnection topologies used for the al- 
gorithms.) These data were measured in the early 1990s, hence the problem sizes 
are small by contemporary standards, but nonetheless, establish the principle. 


An historical aside — the transputer 


In the late 1980s, PCs were limited by the 640 kB limitation on RAM im- 
posed by the then dominant operating system, DOS, and clock speeds were 
low. Supercomputers were (and for that matter still are) extremely expensive. 
A British company, INMOS, introduced the transputer, one of the first “computers
on a chip,” incorporating a CPU, floating point unit, memory and communication
links. (This was to become quite standard later, but at the time was
revolutionary.) The transputer came in several different variants; the T800 
model was the one widely used in parallel processing. 


8 Sometimes, architectural quirks resulted in “superlinear” improvement on specific problems, i.e. a speed-up in 
excess of P; usually, this was a result of the cache design. 

The transputer was a 32-bit RISCᵃ design, capable of internal operation at up 
to 30 MHz — again, this must be seen from the viewpoint of the technology of the 
time! One T800 transputer was able to produce a peak floating point throughput 
of 1.5 Mflop/s. A novel feature, still not widely seen on other systems to this day, 
was the provision of four serial links providing comparatively high-speed com- 
munication either with a host processor or with other transputers. Additionally, 
all components could execute concurrently; each of the four links and the float- 
ing point processor could perform useful work while the other elements were 
executing other instructions. 

The transputer was a very powerful processor in its own right when intro- 
duced, out-performing the microVax, which was then the usual system of choice 
for numeric computations in universities and most research laboratories (out- 
side US government research laboratories). However, it was ideally suited for 
application in parallel processing applications, in particular due to the on-chip 
links, and a number of experimental prototypes and some commercial products 
incorporating transputers were produced around the world. 

The relentless advance of clock speeds in personal computer CPUs during 
the 1990s, combined with an over-dependence on a novel but ultimately com- 
mercially unsuccessful language-cum-operating system, Occam, eventually con- 
signed the transputer to historical notes such as this. However, its role as an inno- 
vative catalyst in affordable parallel processing should not be underestimated; its 
do-it-yourself bargain-basement philosophy, if not technology, inspired a gen- 
eration of computational scientists working at institutions unable to afford the 
extremely expensive supercomputers of the time, and still resonates today in 
current systems using Linux clusters. 


ᵃ Reduced instruction set computer. 


In this context, it is necessary to mention Amdahl’s “law,”⁹ which states that 
if an algorithm contains both a serial and a parallel part, the relative time taken 
by the serial part increases as parallelization reduces that of the parallel part, and 
a law of diminishing returns holds: further parallelization has increasingly little 
influence on run-time. While this observation is perfectly true, for many prob- 
lems the ultimate aim is to increase the problem size that can be handled. Thus as 
more parallelization is made available, larger problems are tackled and the overall 
serial/parallel split remains fairly constant. In particular, the efficiency of many 
parallel algorithms is a function of grain size, the number of unknowns per
processor, N/P. An example of this is shown in Fig. 6.12, which indicates that for a 
particular grain size, the algorithm has approximately constant efficiency. 


9 As with Moore’s “law” (that the number of transistors in integrated circuits doubles approximately every 
18 months), this is really an observation rather than a law in the sense used in physics. 


Figure 6.11 Comparison of measured efficiencies of parallel CG and LU algorithms on a 
transputer array, for an MoM problem with a total of N unknowns running on P processors. 
(Adapted from [17, Fig. 18] and [18, Fig. 12].) 
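
Amdahl's observation is usually written as $S = 1/(s + (1 - s)/P)$, where $s$ is the fraction of the serial run-time that cannot be parallelized. The sketch below tabulates this for a fixed problem; the 5% serial fraction is a made-up value chosen for illustration, and the function names are hypothetical.

```python
def amdahl_speedup(serial_fraction, processors):
    """Amdahl's law: speed-up for a fixed problem with a given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def amdahl_efficiency(serial_fraction, processors):
    """Speed-up normalized by the number of processors."""
    return amdahl_speedup(serial_fraction, processors) / processors


# Hypothetical code with 5% serial work: speed-up saturates below 1/0.05 = 20.
for p in (1, 4, 16, 64, 256):
    s = amdahl_speedup(0.05, p)
    e = amdahl_efficiency(0.05, p)
    print(f"P = {p:3d}: speed-up = {s:6.2f}, efficiency = {e:5.3f}")
```

For a fixed problem size the efficiency falls steadily as processors are added. The grain size argument in the text corresponds to letting the problem grow with $P$, so that the serial fraction itself shrinks and the efficiency holds up.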

When HPC first came to the attention of the CEM community, it was often 
accompanied by highly specialized hardware, frequently purpose built, such as 
the transputer-based arrays mentioned here. However, relatively mainstream en- 
vironments are now the norm, reflecting a degree of maturity in the field. It is 
also notable that the old SIMD-MIMD classification has largely fallen away — 
HPC environments now are generally classified either as SMP (symmetric multi- 
processor), MPP (massively parallel processor) or distributed processing environ- 
ments. The first is currently epitomized by systems such as the Silicon Graphics 
Origin; the number of processors is typically fairly modest, but memory is essen- 
tially shared. The second is epitomized by the Cray T3-E, with a large number of 
processors accessing distributed memory, and the last by heterogeneous networks 
of standard workstations, again with distributed memory but much slower com- 
munication networks than the purpose-built ones incorporated into MPPs. (The 
T3-E actually combines elements of both SMP and MPP paradigms, since it also 
contains a globally addressable memory subsystem.) At the time of writing, yet 
another new paradigm, “grid computing,” is emerging, with the aim of using the 
Internet as a global computer. 


Figure 6.12 Measured efficiency of a parallel CG algorithm on a transputer array, for an 
MoM problem with a total of N unknowns running on P processors. (After [17, Fig. 7], 
©1993 ACES, reprinted with permission.) 

During the 1990s, there was also a major shake-out in the HPC sector; a number 
of the machines (and manufacturers) referenced in papers at that time have long 
ceased trading. Thinking Machines Corp. and their Connection Machines (CM-2 
and CM-5), which were some of the few truly deserving the massively parallel tag, 
with thousands of SIMD processors, are gone. Kendall Square Research, whose 
machines had some innovative features, not least a physically distributed memory 
which was accessed as shared memory by application programs, using a system of 
multi-level caches, has also long ceased to function commercially. Cray Inc. and 
Silicon Graphics remain arguably the most influential commercial vendors in this 
field at the time of writing. 

A noteworthy aspect of the work reported in the literature on parallel process- 
ing is that no new specifically parallel algorithms have arisen in computational 
electromagnetics. Well over a decade back, when parallel computing first attracted 
serious interest, there was speculation in some quarters that the rise of massively 
parallel computers would trigger entirely new algorithms that were only feasible 
in massively parallel computing environments. With hindsight, such claims appear 
as primarily marketing “hype.” Additionally, it has to be commented that at the 
time of writing, the usability of current high-performance computing platforms
remains disappointing: a major bottleneck has been the inadequacy of parallel
compilers and system software. Although considerably improved over the systems of 
a decade ago (where such system software was sometimes entirely absent),
fundamental items such as parallel I/O and easy-to-use parallel debuggers have not 
appeared. What is encouraging has been the emergence of two standardized
“harnesses”: parallel virtual machine (PVM) and message passing interface (MPI). 
These provide standardized high-level communication routines (via libraries) to 
route data between processes, removing, or at least greatly reducing, the hardware 
dependent implementations which characterized earlier work. 


Figure 6.13 Run-times for LU decomposition, compared for systems capable of sustaining 
1 megaflop, 1 gigaflop and 1 teraflop. 

Nonetheless, despite implementation issues which remain challenging, parallel 
processing has emerged as a very useful enabling technology; several commercial 
codes (such as FEKO) are available in parallelized versions for various platforms. 
Whilst one does not always appreciate the impact of incremental increases in per- 
formance, when compounded over decades the results are deeply impressive. In 
Fig. 6.13, the time required for direct matrix solution (LU decomposition) on
systems capable of sustaining 1 megaflop, 1 gigaflop and 1 teraflop respectively is 
compared.¹⁰ Comparing a 1 megaflop machine (typical of the late 1980s) and a 1 gigaflop 
machine (typical of current systems), one sees that for a problem with around 1000 
unknowns, the time has dropped from around an hour to a few seconds. A similar 
improvement is noted for a 10 000 unknown problem when comparing a 1 gigaflop 
and a 1 teraflop machine.¹¹ 


10 The operation count for LU decomposition for a matrix of dimension N with complex valued entries is
approximately $\frac{8}{3}N^3$ floating point operations. 
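
The run-times quoted above follow directly from the operation count in footnote 10. The sketch below reproduces the arithmetic; it assumes the machine sustains its nominal rate throughout and ignores matrix fill, memory traffic and I/O. The function name `lu_seconds` is hypothetical.

```python
def lu_seconds(n_unknowns, flops_per_second):
    """Estimated complex LU decomposition time from the (8/3) N^3 operation count."""
    return (8.0 / 3.0) * n_unknowns ** 3 / flops_per_second


RATES = (("1 megaflop", 1e6), ("1 gigaflop", 1e9), ("1 teraflop", 1e12))

for n in (1000, 10000):
    for label, rate in RATES:
        print(f"N = {n:6d} on {label}: {lu_seconds(n, rate):.3g} s")
```

For 1000 unknowns this gives roughly 2700 s (about three quarters of an hour) at 1 megaflop and under 3 s at 1 gigaflop; for 10 000 unknowns the same pair of figures appears one machine generation later.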


6.8.3 FFT-based methods 


If we refer back to the very simple introductory thin-wire example of Chapter 4, 
specifically to Eqs. (4.15) and (4.16), we note that $Z_{mn}$ is a function of only $m - n$ 
and $\Delta$. With a uniform discretization, as used there, the latter is constant, and hence 
we actually only need to compute one row of the matrix. This is known as Toeplitz 
(or translational) symmetry. The reason that this observation is important is that in 
this case, the product of this matrix with a vector can be implemented as a discrete 
convolution. 
In general, a discrete convolution is an operation of the form

$$ e_m = \sum_{n=0}^{N-1} j_n \, g_{m-n} \tag{6.26} $$


or in matrix form 


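$$
\begin{bmatrix} e_0 \\ e_1 \\ e_2 \\ \vdots \\ e_{N-1} \end{bmatrix}
=
\begin{bmatrix}
g_0 & g_{-1} & g_{-2} & \cdots & g_{1-N} \\
g_1 & g_0 & g_{-1} & \cdots & g_{2-N} \\
g_2 & g_1 & g_0 & \cdots & g_{3-N} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
g_{N-1} & g_{N-2} & g_{N-3} & \cdots & g_0
\end{bmatrix}
\begin{bmatrix} j_0 \\ j_1 \\ j_2 \\ \vdots \\ j_{N-1} \end{bmatrix}
\tag{6.27}
$$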


The N x N matrix in the above is a general Toeplitz matrix; all the elements of 
this matrix are described by the 2N — 1 entries in the first row and column. If the 
elements repeat with period N, so that 


$$ g_{n-N} = g_n, \qquad n = 1, 2, \ldots, N-1 \tag{6.28} $$


then the operation is known as a circular discrete convolution, and the N x N ma- 
trix above is circulant. Otherwise, the operation is a linear discrete convolution. 
Any linear discrete convolution of length N can be embedded into a circular dis- 
crete convolution of length 2N — 1 by extending the original sequence g to repeat 


11 In November 1998, a CRAY T3-E became the first supercomputer to sustain the latter rate of computation on
a real-world computation.
with period $2N - 1$, zero padding the sequence j to length $2N - 1$ and changing the
upper limit of summation in Eq. (6.26) to $2N - 2$.

The discrete convolution theorem states that if Eq. (6.26) is a circular discrete 
convolution, it is equivalent to 


$$ \hat{e}_n = \hat{j}_n \, \hat{g}_n, \qquad n = 0, 1, \ldots, N-1 \tag{6.29} $$

where $\hat{e}$ is the N-point discrete Fourier transform (DFT) of e, and similarly for
$\hat{j}$ and $\hat{g}$. The DFT will of course be implemented using the FFT algorithm. If
Eq. (6.26) is a linear discrete convolution, then embedding as described above is
used.

Hence, the matrix-vector product of Eq. (6.27) can be efficiently implemented 
as 


$$ e = \mathrm{FFT}_N^{-1} \left\{ \mathrm{FFT}_N(j) \, \mathrm{FFT}_N(g) \right\} \tag{6.30} $$


In the MoM context, with a Toeplitz matrix, the matrix-vector product is thus ex- 
pressed as 


$$ \sum_{n=1}^{N} Z_{mn} I_n = Z_m \otimes I_n \tag{6.31} $$

where $Z_m = Z_{1m}$ and $\otimes$ indicates cyclic convolution, evaluated as

$$ [Z]\{I\} = \mathrm{FFT}_N^{-1} \left\{ \mathrm{FFT}_N(Z_m) \, \mathrm{FFT}_N(I) \right\} \tag{6.32} $$

Usually, $\{I\}$ is an approximation of the current, typically $\{I_k\}$, at the kth iteration
of an iterative solver.

Note that the convolution has become the Hadamard (i.e. element-by-element)
product, and hence for an iterative algorithm, the $O(N^2)$ cost of the
matrix-vector product (usually required once or twice per iteration) has been
reduced to $O(N \log N)$. Also very importantly, the memory requirement is reduced
from $O(N^2)$ to $O(N)$.
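
The embedding and Eq. (6.30) can be illustrated with a short sketch. It uses only the Python standard library, so a small radix-2 FFT is included; the function names are illustrative and not taken from the book's scripts. As an assumption made for convenience, the linear convolution is embedded in a circulant of length M, the smallest power of two not less than $2N - 1$ (rather than exactly $2N - 1$), with zeros filling the unused entries.

```python
import cmath


def fft(x, sign=+1):
    """Recursive radix-2 DFT with kernel exp(sign*2*pi*i*k*n/N); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2], sign)
    odd = fft(x[1::2], sign)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(x):
    """Inverse of fft(x, +1): opposite sign in the exponent and a 1/N factor."""
    n = len(x)
    return [v / n for v in fft(x, sign=-1)]


def toeplitz_matvec_fft(first_col, first_row, vec):
    """Toeplitz matrix-vector product via circulant embedding and the FFT.

    first_col[k] = g_k and first_row[k] = g_{-k} for k = 0..N-1.
    """
    n = len(vec)
    m = 1
    while m < 2 * n - 1:
        m *= 2
    g = [0j] * m
    for k in range(n):
        g[k] = first_col[k]          # g_0 ... g_{N-1}
    for k in range(1, n):
        g[m - k] = first_row[k]      # g_{-k} stored at index M - k
    j = list(vec) + [0j] * (m - n)   # zero-padded vector
    product = [a * b for a, b in zip(fft(g), fft(j))]  # Hadamard product
    return ifft(product)[:n]


def toeplitz_matvec_direct(first_col, first_row, vec):
    """Reference O(N^2) evaluation of Eq. (6.26)."""
    n = len(vec)

    def g(d):
        return first_col[d] if d >= 0 else first_row[-d]

    return [sum(g(m - k) * vec[k] for k in range(n)) for m in range(n)]
```

Only the first row and column are stored, which is the $O(N)$ memory saving noted above; the direct routine is included purely as a check.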

This can of course be extended to two and three dimensions, using two- and 
three-dimensional FFTs as appropriate; the requirement remains that the grid 
should be a regular Cartesian one. Indeed, three-dimensional FFT-based methods 
provide quite efficient ways of dealing with the volume integral MoM discretiza- 
tions. 

The adaptive integral method is an extension of this idea to triangular sur- 
face grids. In this case, the triangular subdomain basis functions are projected 
onto a rectangular grid so that the FFT can be applied for the matrix-vector 
product. 
A mathematical aside: what makes the fast Fourier transform (FFT) fast?


The FFT must rate as one of the top numerical algorithms of the twentieth century.
Although first popularized by J. W. Cooley and J. W. Tukey in the mid
1960s, perhaps as many as a dozen individuals had independently discovered, 
and in some cases implemented, efficient methods for evaluating the discrete 
Fourier transform (DFT), starting with no less a figure than Gauss in 1805. As 
usual, the treatment in [8] is both highly entertaining and informative, and the 
following is a summary thereof. 

Firstly, until the mid 1960s, the standard method for evaluating an N-point
DFT of the discrete function $h_k$,

$$ H_n = \sum_{k=0}^{N-1} h_k \, e^{2\pi i k n / N} \tag{6.33} $$

was to define the complex number W as (note that $i = \sqrt{-1}$, the unit imaginary
number, not a counter!)

$$ W = e^{2\pi i / N} \tag{6.34} $$

and then the DFT can be written as

$$ H_n = \sum_{k=0}^{N-1} W^{nk} h_k, \qquad n = 0, 1, \ldots, N-1 \tag{6.35} $$

Clearly, this is the product of a matrix of size N x N (whose (n, k)th
entry is W to the power of n x k) with a vector of length N. Each of the N values
of n requires N multiplications, yielding an $O(N^2)$ algorithm.

One of the “rediscoveries” of the algorithm which provides one of the clear- 
est derivations of the FFT is that of Danielson and Lanczos in 1942. The 
DFT is written as the sum of two DFTs, each of length N/2. One is formed 
from the even-numbered points, one from the odd-numbered points. Mathe- 
matically, 


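$$
\begin{aligned}
F_k &= \sum_{j=0}^{N-1} e^{2\pi i jk/N} f_j \\
    &= \sum_{j=0}^{N/2-1} e^{2\pi i k(2j)/N} f_{2j} + \sum_{j=0}^{N/2-1} e^{2\pi i k(2j+1)/N} f_{2j+1} \\
    &= \sum_{j=0}^{N/2-1} e^{2\pi i kj/(N/2)} f_{2j} + W^k \sum_{j=0}^{N/2-1} e^{2\pi i kj/(N/2)} f_{2j+1} \\
    &= F_k^e + W^k F_k^o
\end{aligned}
\tag{6.36}
$$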


$F_k^e$ is the kth component of the Fourier transform of length N/2 formed from the
even components of the original $f_j$, and similarly $F_k^o$ is the corresponding transform
formed from the odd components. Although in the last line of Eq. (6.36), k
varies from 0 to N − 1, not just N/2 − 1, the transforms $F_k^e$ and $F_k^o$ are periodic
in k with length N/2, so each is simply repeated through two cycles.*

The neat point about this algorithm is that it can be applied recursively. For
instance, $F_k^e$ can now be subdivided into $F_k^{ee}$ and $F_k^{eo}$. For N a power of two, this
can be continued down to the point where one is left with the transform of length
one, which simply copies the input to the output. There are $\log_2 N$ such recursions.
These one-point transforms are then combined appropriately. Each such
combination takes of order N operations, there are $\log_2 N$ such combinations,
hence we have the $O(N \log_2 N)$ operation count of the FFT.

The above is not a complete description of the algorithm; one still needs to
perform some book-keeping to keep track of which one-point transform corresponds
to which combination of even-odd subdivisions, e.g. $F^{eoe}$ for an eight-point
transform. By bit-reversing the binary representation of each index of the
input vector, it turns out that this can be done very efficiently. The interested
reader can refer to [8, Section 12.2] for the details.
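
The Danielson and Lanczos splitting of Eq. (6.36) translates almost line for line into a recursive routine. The sketch below is illustrative only (standard-library Python, function names chosen here): it uses the sign convention of Eq. (6.33), avoids the bit-reversal book-keeping by letting the recursion reorder the data, and includes the direct $O(N^2)$ sum of Eq. (6.35) as a check.

```python
import cmath


def dft_direct(h):
    """O(N^2) evaluation of Eq. (6.35): H_n = sum_k W^(n k) h_k, with W = exp(2 pi i / N)."""
    n_pts = len(h)
    w = cmath.exp(2j * cmath.pi / n_pts)
    return [sum(w ** (n * k) * h[k] for k in range(n_pts)) for n in range(n_pts)]


def fft_recursive(f):
    """Danielson-Lanczos recursion, Eq. (6.36); len(f) must be a power of two."""
    n_pts = len(f)
    if n_pts == 1:
        return [complex(f[0])]       # a one-point transform copies input to output
    if n_pts % 2:
        raise ValueError("length must be a power of two")
    f_even = fft_recursive(f[0::2])  # F^e, length N/2
    f_odd = fft_recursive(f[1::2])   # F^o, length N/2
    half = n_pts // 2
    out = []
    for k in range(n_pts):
        w_k = cmath.exp(2j * cmath.pi * k / n_pts)  # W^k
        # F^e and F^o are periodic in k with period N/2
        out.append(f_even[k % half] + w_k * f_odd[k % half])
    return out


samples = [1, 2, 0, -1, 0.5j, 3, -2, 1]
difference = max(abs(a - b) for a, b in zip(dft_direct(samples), fft_recursive(samples)))
print(f"largest difference between direct DFT and recursion: {difference:.1e}")
```

Each level of recursion performs of order N operations and there are $\log_2 N$ levels, which is the operation count derived above.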


* Another way of looking at this is that taking even-numbered points is equivalent to halving the sampling
density, hence the aliasing frequency also halves.


6.8.4 The fast multipole method 
A two-dimensional FMM prototype 


Whereas the FFT-based methods rely on the algebraic properties of the DFT, the 
fast multipole method (FMM) is based on the analytical properties of the Green 
function. Before we briefly introduce the full FMM, it is worth discussing a two- 
dimensional example originally developed by Lu and Chew, which captures the 
essence of the algorithm in a far more readily accessible form; it is presented in 
the following form in [19, Section 4.13].

Assume a $\mathrm{TM}_z$, PEC scattering problem. In this case, the EFIE is [19, Section
2.1]

$$ E_z^{\mathrm{inc}}(t) = jk\eta A_z(t) \tag{6.37} $$

where

$$ A_z(t) = \int J_z(t') \, \frac{1}{4j} H_0^{(2)}(kR) \, dt', \qquad R = \sqrt{[x(t) - x(t')]^2 + [y(t) - y(t')]^2} \tag{6.38} $$


with t a parametric variable describing position around the contour of the cylinder
surface, and the primed and unprimed coordinates indicate source and field points,
respectively, as usual. Using subsectional pulse basis functions, as in Chapter 4, one obtains the
usual MoM matrix equation, with impedance matrix entries which, for segments
small compared to a wavelength, may be approximated by

$$ Z_{mn} \approx \frac{k\eta}{4} \, w_n H_0^{(2)}(kR_{mn}), \qquad \forall\, m \neq n \tag{6.39} $$

with $w_n$ the width of cell n and

$$ R_{mn} = \sqrt{(x_m - x_n)^2 + (y_m - y_n)^2} \tag{6.40} $$


This, then, is the conventional MoM solution of this problem. We will assume
that there are no geometrical properties of the shape of the circumference that
we can exploit. (For instance, if it is a right circular cylinder, and the discretization
is uniform, we have a Toeplitz matrix and we can apply the FFT approach
to reduce the cost.) If we seek the solution of $[Z]\{I\} = \{V\}$ using a conventional
iterative solver, the cost per iteration will be $O(N^2)$. The memory requirement is
also $O(N^2)$.

Now, consider a fast approach for computing the matrix-vector
product. As usual, the circumference of the cylinder will be divided into N segments
(which need not be equal in size in this approach). Now, the new idea: we
collect these segments into p groups¹² of roughly equal size and number of unknowns.
We index the groups as i = 1, 2, ..., p; there are now N/p segments per
group, indexed as n = 0, 1, ..., N/p − 1 in each group. One segment per group
will be centered at a local origin $(x_{i0}, y_{i0})$, whilst the other segment centroids
are denoted by $(x_{in}, y_{in})$. For source and field cells closely located, the “near-zone,”
the calculation proceeds as usual. However, for other segments, sufficiently
far separated that they are in the “far-zone,” an approximation will be used as
follows.


12 In the presentation of [19, Section 4.13], the terms “cells” and “segments” are used for what are here called
segments and groups, respectively. The latter is rather confusing, since a segment in an MoM formulation is
usually the sub-domain spanned by one (or sometimes a few) basis functions. The nomenclature used in this
section corresponds to typical FMM usage.


Figure 6.14 Groups i and j, with segments in and jm.


Consider the calculation of the field at $(x_{jm}, y_{jm})$ due to sources on group i (see
Fig. 6.14):

$$ E_z^s(x_{jm}, y_{jm}) = -\frac{k\eta}{4} \sum_{n=0}^{N/p-1} j_n w_n H_0^{(2)}(kR_{jmin}) \tag{6.41} $$

The distance function $R_{jmin}$ is approximated as

$$ R_{jmin} \approx R_{j0i0} + R_{jm} - R_{in} \tag{6.42} $$

where

$$ R_{j0i0} = \sqrt{(x_{j0} - x_{i0})^2 + (y_{j0} - y_{i0})^2} \tag{6.43} $$

$$ R_{jm} = (x_{jm} - x_{j0}) \cos\phi + (y_{jm} - y_{j0}) \sin\phi \tag{6.44} $$

$$ R_{in} = (x_{in} - x_{i0}) \cos\phi + (y_{in} - y_{i0}) \sin\phi \tag{6.45} $$

The angle $\phi$ denotes the orientation of $R_{j0i0}$ with respect to the x-axis. (This is
just the usual far-field approximation used in the derivation of the potential of a
two-dimensional dipole.)

Now, the asymptotic form of the Hankel function for large arguments,

$$ H_0^{(2)}(k\rho) \approx \sqrt{\frac{2j}{\pi k\rho}} \, e^{-jk\rho} \tag{6.46} $$

is applied, yielding

$$ H_0^{(2)}(kR_{jmin}) \approx H_0^{(2)}(kR_{j0i0}) \, e^{-jkR_{jm}} \, e^{jkR_{in}} \tag{6.47} $$
and thus Eq. (6.41) can be replaced by

$$ E_z^s(x_{jm}, y_{jm}) \approx -\frac{k\eta}{4} H_0^{(2)}(kR_{j0i0}) \, e^{-jkR_{jm}} \sum_{n=0}^{N/p-1} j_n w_n e^{jkR_{in}} \tag{6.48} $$

Hence, all interactions between the cells in groups i and j can be obtained from
a single summation over the coefficients $j_n$ and one Hankel function calculation.
This involves $O(N/p)$ operations. There are approximately $p^2$ combinations of
far-zone groups, so the overall complexity grows as $O(Np)$. It can be shown that
the optimal grouping is $p = \sqrt{N}$, in which case the complexity is $O(N^{3/2})$.
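
The optimal grouping can be motivated by balancing near- and far-zone work. As a sketch, and under an assumption not spelled out in the text, suppose each segment interacts conventionally with the $O(N/p)$ segments of a fixed number of neighbouring groups, so that the near-zone cost is proportional to $N^2/p$; the constants a and b below are unspecified and purely illustrative.

```latex
C(p) \approx a\,\frac{N^2}{p} + b\,Np,
\qquad
\frac{dC}{dp} = -a\,\frac{N^2}{p^2} + bN = 0
\;\Rightarrow\;
p_{\mathrm{opt}} = \sqrt{\frac{a}{b}}\,\sqrt{N},
\qquad
C(p_{\mathrm{opt}}) = 2\sqrt{ab}\;N^{3/2}
```

Too few groups leaves the near-zone work close to the conventional $O(N^2)$; too many makes the number of group pairs dominate.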
It is useful to separate the operations contained in Eq. (6.48). First, the sources
on group i are aggregated together via the summation

$$ S_i = \sum_{n=0}^{N/p-1} j_n w_n e^{jkR_{in}} \tag{6.49} $$

Then, translation uses the Hankel function

$$ E_z^s(x_{j0}, y_{j0}) \approx -\frac{k\eta}{4} H_0^{(2)}(kR_{j0i0}) \, S_i \tag{6.50} $$

to shift the field to the center of group j. Finally, the scattered field is disaggregated
throughout group j by a multiplication with the phase correction

$$ E_z^s(x_{jm}, y_{jm}) \approx e^{-jkR_{jm}} \, E_z^s(x_{j0}, y_{j0}) \tag{6.51} $$

We find analogous steps in the full FMM.
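
The three steps can be tried out numerically for a single pair of well-separated groups. The sketch below is illustrative only: all names and the test geometry are hypothetical, the common factor $-k\eta/4$ is omitted from both evaluations, and, to stay within the Python standard library, the large-argument form of Eq. (6.46) stands in for the exact Hankel function in the direct sum as well.

```python
import cmath
import math


def hankel2_large_arg(x):
    """Large-argument form of H0^(2)(x), Eq. (6.46); an approximation, not the exact function."""
    return cmath.sqrt(2j / (math.pi * x)) * cmath.exp(-1j * x)


def field_direct(k, sources, currents, widths, field_points):
    """Summation of Eq. (6.41) for every field point (factor -k*eta/4 omitted)."""
    result = []
    for xf, yf in field_points:
        total = 0j
        for (xs, ys), jn, wn in zip(sources, currents, widths):
            total += jn * wn * hankel2_large_arg(k * math.hypot(xf - xs, yf - ys))
        result.append(total)
    return result


def field_grouped(k, sources, currents, widths, centre_i, field_points, centre_j):
    """Aggregation, translation and disaggregation, Eqs. (6.49)-(6.51)."""
    dx, dy = centre_j[0] - centre_i[0], centre_j[1] - centre_i[1]
    r0 = math.hypot(dx, dy)                      # R_j0i0, Eq. (6.43)
    cos_phi, sin_phi = dx / r0, dy / r0
    s_i = 0j                                     # aggregation, Eq. (6.49)
    for (xs, ys), jn, wn in zip(sources, currents, widths):
        r_in = (xs - centre_i[0]) * cos_phi + (ys - centre_i[1]) * sin_phi
        s_i += jn * wn * cmath.exp(1j * k * r_in)
    e_centre = hankel2_large_arg(k * r0) * s_i   # translation, Eq. (6.50)
    result = []
    for xf, yf in field_points:                  # disaggregation, Eq. (6.51)
        r_jm = (xf - centre_j[0]) * cos_phi + (yf - centre_j[1]) * sin_phi
        result.append(cmath.exp(-1j * k * r_jm) * e_centre)
    return result


k = 2.0 * math.pi                                # wavelength of one length unit
src = [(-0.1, 0.05), (0.0, 0.0), (0.1, -0.05)]   # group i, centred at the origin
cur = [1.0, 0.5j, -0.3]
wid = [0.1, 0.1, 0.1]
fld = [(40.1, 30.05), (40.0, 30.0), (39.9, 29.95)]  # group j, 50 wavelengths away
direct = field_direct(k, src, cur, wid, fld)
grouped = field_grouped(k, src, cur, wid, (0.0, 0.0), fld, (40.0, 30.0))
rel_err = max(abs(a - b) for a, b in zip(direct, grouped)) / max(abs(a) for a in direct)
print(f"relative error of grouped evaluation: {rel_err:.2e}")
```

The aggregated sum is computed once per source group and reused for every field point in the receiving group, which is where the saving over the double loop of the direct evaluation arises.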


The full three-dimensional FMM 


The FMM rests on two identities. The first, a form of Gegenbauer’s addition theorem,
states that

$$ \frac{e^{-jk_0|\mathbf{r}+\mathbf{d}|}}{|\mathbf{r}+\mathbf{d}|} = -jk_0 \sum_{l=0}^{\infty} (-1)^l (2l+1) \, j_l(k_0 d) \, h_l^{(2)}(k_0 r) \, P_l(\hat{d}\cdot\hat{r}) \tag{6.52} $$

where $j_l(x)$ is a spherical Bessel function of the first kind, $h_l^{(2)}(x)$ is a spherical
Hankel function of the second kind, $P_l(x)$ is a Legendre polynomial, and d < r.
All the special functions are as defined in standard texts, e.g. [20]. The second
identity is a spectral decomposition of the product of the Bessel function and the
Legendre polynomial, into propagating plane waves:

$$ 4\pi(-j)^l \, j_l(k_0 d) \, P_l(\hat{d}\cdot\hat{r}) = \int_S e^{-j\mathbf{k}\cdot\mathbf{d}} \, P_l(\hat{k}\cdot\hat{r}) \, d^2\hat{k} \tag{6.53} $$
S 
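where the integral is over a unit sphere S and $\mathbf{k} = k_0\hat{k}$. Substituting Eq. (6.53) into
Eq. (6.52), truncating the sum at l = L, and interchanging the order of integration and
summation (which has been described as “illegitimate but expedient” [21]), we obtain the
approximation

$$ \frac{e^{-jk_0|\mathbf{r}+\mathbf{d}|}}{|\mathbf{r}+\mathbf{d}|} \approx \frac{-jk_0}{4\pi} \int_S e^{-j\mathbf{k}\cdot\mathbf{d}} \, T_L(\hat{k}\cdot\hat{r}) \, d^2\hat{k} \tag{6.54} $$

with

$$ T_L(\hat{k}\cdot\hat{r}) = \sum_{l=0}^{L} (-j)^l (2l+1) \, h_l^{(2)}(k_0 r) \, P_l(\hat{k}\cdot\hat{r}) \tag{6.55} $$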

The first key point in the FMM is the function $T_L(\hat{k}\cdot\hat{r}) = T_L(\kappa, \theta)$ with
$\kappa = k_0 r$, precomputed for various values of distance $\kappa$ and various angles $\theta$. This
is a truncated multipole expansion, hence the name: it has been shown semi-empirically
that the number of multipoles is approximately $k_0 D + 6(k_0 D)^{1/3}$
(with D the maximum dimension applicable) for an accuracy of $10^{-6}$.

The second key point of the FMM is that the interaction matrix is divided into 
near and far parts. Near interactions are computed as usual with the MoM, and the 
FMM does not change these at all (by contrast, FFT methods evaluate all matrix 
elements). Far interactions are evaluated approximately, using the above function 
$T_L$. Basis functions in the far region are grouped into M localized groups; it has
been shown that the optimal value of this is $\sqrt{N}$, with N the number of basis
functions.

The third key point in the FMM is that the (approximate) matrix-vector product 
may be done in $O(N^{3/2})$ operations. This is done by first computing the far fields
of each group, then computing the Fourier components of the field in the neighbor- 
hood of each group generated by non-near sources, and finally adding the effects 
of the near- and far-group interactions. These steps are also known as aggregation, 
lumping the fields radiated by a group to the group center, translation and summa- 
tion, which sends the fields from one group to another and then sums them, and 
finally disaggregation, which distributes the received field to each point within the 
receiving group. 

By introducing a recursive hierarchy of groups, the operation count can be fur- 
ther reduced to O(N log N); this is known as the multilevel fast multipole algo- 
rithm (MLFMA). 

The above description is very cursory, and the interested reader is referred to 
Section 6.9 for references which provide far more detail. We should caution that 
the constants in the operation counts can be very large, easily on the order of many 
thousands or more (by contrast, for direct methods or matrix-vector multiplication, 
the constants are usually on the order of unity) so the FMM and MLFMA are
only asymptotically “fast”; indeed for small to medium size problems, the FMM
will probably be slower than the conventional MoM. Furthermore, for large problems, highly
efficient implementation is essential, otherwise the benefits are lost, so an FMM
implementation is emphatically not a project for beginners.

Figure 6.15 Run-times for typical LU decomposition, a very rapidly converging iterative
solution and a well-optimized FMM solution on a 1 gigaflop system. (Adapted from [22,
Fig. 14.8].)

The impact of a reduction in asymptotic cost is not always immediately ap- 
parent. To illustrate this, Fig. 6.15 compares the run-time on a system capable of 
sustaining 1 gigaflop for $N^3$, $100N^2$ (as one might hope to obtain with a very
rapidly converging iterative solver) and 1000N log N, as one might obtain with a 
very well optimized FMM code, as suggested by [22, Fig. 14.8]. Clearly, the im- 
pact of reducing this asymptotic cost is enormously significant for large problems; 
the difference with the assumed operation counts for 1 million unknowns is that 
of minutes versus decades! (In reality, the FMM code is likely to run for many 
hours at least, but the point remains valid.) It must be commented that the con- 
stants assumed in both the iterative and FMM cases above may well be extremely 
optimistic. 
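
The comparison is easily reproduced. The following sketch assumes the three operation counts quoted above and, since the base of the logarithm is not stated in the text, takes it to be 2; the dictionary keys and the function name `run_times` are illustrative.

```python
import math

RATE = 1.0e9  # sustained floating point operations per second (1 gigaflop)


def run_times(n):
    """Run-times in seconds for the three assumed operation counts."""
    return {
        "N^3": n ** 3 / RATE,
        "100 N^2": 100.0 * n ** 2 / RATE,
        "1000 N log2 N": 1000.0 * n * math.log2(n) / RATE,  # base 2 is an assumption
    }


SECONDS_PER_YEAR = 365.0 * 24.0 * 3600.0
for label, seconds in run_times(10 ** 6).items():
    print(f"{label:>14s}: {seconds:12.3e} s = {seconds / SECONDS_PER_YEAR:10.3e} years")
```

For one million unknowns the cubic count corresponds to about $10^9$ s, or roughly three decades, whereas the $N \log N$ count stays well under a minute with these optimistic constants.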

The impact on memory is also highly significant; Fig. 6.16 compares the mem- 
ory required to store the full MoM matrix compared to the storage requirements of 
a proposed FMM implementation, as suggested by [22, Fig. 14.9]. (Note that each 
complex word requires 8 bytes to store in single precision on typical systems.) 
Figure 6.16 Memory required for LU decomposition and a proposed FMM implementation.
(Adapted from [22, Fig. 14.8].)


Again, one should note that a real FMM implementation is unlikely to be this 
memory efficient. 


6.9 Further reading 


The electric and magnetic field integral equations are covered in a number of texts 
on electromagnetic theory and CEM. There are many equivalent different forms, 
depending on how the differentials are treated; those in this chapter are based on 
[23]. An introductory treatment may be found in [5, Chapter 12]. Good treatments 
will also be found in [22, Chapter 14], with more on the underlying theory in [1, 
Sections 6—9 and 12-3]. The topic is also discussed in [24]. A point which we 
have glossed over in this chapter is that both the EFIE and MFIE exhibit a phe- 
nomenon known as interior resonance. Essentially, a (theoretically) non-radiating 
interior eigenmode is also present in the MoM solution procedure,¹³ and due to numerical
inaccuracies, the eigenmode incorrectly contributes to radiation. Canning
showed that there is a component of the field equations which should annihilate this 
term, but that this term is slightly “off” in frequency in the discrete MoM solution, 
hence the problem. He proposed a method using singular value decomposition to 


13 We assume here the usual exterior field problem. 
remove this term [25]; although it worked well for canonical problems, Steyn and
the present author showed that it was difficult to apply to more general problems 
[26]. The topic of interior resonances in general has been quite extensively dis- 
cussed in the literature; in practice, it is usually a very narrowband phenomenon, 
and for simple problems (in particular ones where the eigenvalues can be predicted 
analytically) can simply be “smoothed” through, but a rigorous solution requires 
a combination of both EFIE and MFIE, as the combined field integral equation. A 
particularly comprehensive discussion of this may be found in [19, Chapter 6]. 

In the context of equivalent surface current modelling, discussions of the equiv- 
alence principle will be found in several standard texts; that in [5, Section 7.8] is 
especially useful. For the modelling of homogeneous and inhomogeneous material 
bodies, few textbooks discuss this topic —[1, Chapter 12] being a notable exception 
— and one will need to refer largely to journal papers. One of the earliest papers 
to consider this was Richmond’s [27], although his formulation was essentially a 
volume equivalence one. For details of the surface equivalence formulation, [28] 
provides a comprehensive discussion and an extensive, although not exhaustive, 
list of references. The discussion of the equivalence principle is often quite cur- 
sory; a particularly detailed study has recently been published by Booysen [29]. 

On hybrid MoM/PO methods, Jakobus and Landstorfer’s original papers [14, 
30] remain the best reference. 

Regarding parallel processing, the present author made some of the earlier con- 
tributions in this regard [17, 18, 31]; other early work may be found in [32]. With 
Cwik, the present author recently summarized much of the state-of-the-art [33]; 
this special issue contains papers by many of the researchers active in the field in 
the mid to late 1990s. 

There is now a large body of literature dealing with fast techniques in CEM. A 
very readable introductory treatment will be found in [19, Chapter 4]. Jin provides 
a detailed, up-to-date and yet succinct overview of fast methods in general in [22, 
Chapter 14], and this would serve well as a first reference for more detailed study; 
a fairly extensive list of references complements the technical descriptions. On a 
historical note, Bojarski is credited with the first use of the FFT method in electromagnetics
for this purpose,¹⁴ in a US Air Force technical report of 1971, although
the work was only published in the archival open literature a decade later [34]. The 
application of the FFT to surface and volumetric scattering is well illustrated by 
the work of Zwamborn and van den Berg, of which [35] is a good example, and 
also by Borup and Gandhi [36]. For some of the early work on iterative methods, 
the papers by Sarkar contain useful descriptions of the iterative algorithms ([37]
is typical), but it should be noted that there are misconceptions in this and other 


14 He used the term “k-space” in his work rather than CGFFT. 
papers about the nature of discrete operators. This led to a lengthy debate in the
literature (see [38], for instance, as well as comments in [24, Chapter 1]); this was 
finally settled by Ray and Peterson [39]. Their closing comment is conclusive: 


While direct iterative methods may be very efficient for some problems, they are no more 
accurate than their moment-method analogs. 


On the FMM, the paper by Coifman, Rokhlin and Wandzura [21] remains a classic;
the paper belies its title, providing the essential ideas and outlining the implementation
in only six pages. (Note that they use the $e^{-i\omega t}$ convention widely used
in physics, so the signs of i are reversed relative to the discussion in Section 6.8.4, 
and the spherical Hankel function is of the first kind.) Chew and colleagues at 
Illinois have been prolific users of the method; their recent book provides a de- 
tailed discussion of the many applications [40], and their review paper provides a 
succinct overview of the field [41]. On the question of error control, the paper by 
Botha and the present author presents a detailed discussion [42]. 


6.10 Concluding comments 


In this chapter, we have studied methods of solving for currents on surfaces using
the MoM, starting with the electric and magnetic field integral equations. These
may be real currents, in the case of a PEC, or fictitious ones, in the case of a
homogeneous dielectric (or magnetic) body. Some theoretical background on the
RWG surface basis functions has also been provided, since these are widely used 
in commercial codes. The ability to model homogeneous material bodies using fic- 
titious equivalent surface currents is very useful indeed; some MoM codes, such as 
FEKO, can also handle inhomogeneous material bodies, using an equivalent vol- 
ume current method, but the computational cost associated with this is extremely 
high, as we have seen (unless FFT-based methods are used). 

The much larger computational requirements of surface modelling as opposed 
to thin-wire modelling have been discussed comprehensively. A hybrid MoM/PO 
formulation has been outlined. Although inherently approximate, this permits large 
structures to be modelled with good accuracy provided caution is exercised; it is 
particularly useful for what is often called “installed antenna performance modelling,”
which frequently involves electrically small antennas mounted on electrically
large vehicles (used here in the general sense to include aircraft, spacecraft
and ships). A commercial implementation of this theory is available and we
have shown an example of its use. High-performance computing has also been dis- 
cussed; this continues to be an important enabling technology driving very large 
applications of the method. Finally, “fast” methods have been considered, includ- 
ing the original FFT-based methods, extensions in the form of the adaptive integral 
method, and of course the fast multipole method. The last in particular rejuvenated
the method of moments in the early 1990s and has proven one of the most
important theoretical advances in the MoM over the last two decades.
The method of moments and stratified media: theory


7.1 Introduction 


Modelling stratified media is an important application of the MoM. A stratified 
medium is one consisting of homogeneous layers of material, each layer having 
different electromagnetic properties. This includes the general category of printed 
antennas, of which microstrip is the best known. (Microstrip technology is dis- 
cussed in more detail in the next chapter.) It also brings with it the problem of 
dealing with dielectric materials. Central to this is the issue of the Green function¹
for the problem. The MoM relies on an appropriate Green function as the
“field propagator.” Due to its perceived complexity, the topic of stratified media is 
generally regarded as an advanced one, and the coverage tends to be highly the- 
oretical, and frequently impenetrable without lengthy study. One reason for this 
is that historically, analysis focussed on the problem of a dipole above a dielectric 
half-space. There are a number of complex issues which this raises, requiring quite 
sophisticated analytical techniques to understand, in particular for the asymptotic 
cases where interesting radiation physics can be extracted. However, the analysis 
of a very important special case, namely the grounded single-layer microstrip line 
(or patch antenna), can be undertaken without undue complexity, at least for most 
practical cases where the substrate is relatively thin. 

In this chapter, a static analysis of a microstrip transmission line is first under- 
taken, to demonstrate the basic principles of the spectral domain and the derivation 
of the Green function. Following this, the dynamic analysis is introduced, and the 
Sommerfeld potentials derived from first principles. Although the work in this 
chapter is certainly not original, being based on a synthesis of the literature — in 
particular [1] — the presentation in the present format does not appear to have been 
thus undertaken in other works to date. 


1 Contemporary usage is “Green function” rather than “Green’s function,” in line with “Dirac delta function,” 
“Heaviside step function” etc. 
7.2 Dyadic Green functions: some introductory notes


The main reason for the efficiency of the MoM formulation already discussed is the
existence of suitable Green functions. The Green function G(r) is equivalent to 
the impulse response h(t) of system theory. Just as h(t) gives the response (in 
time) to a temporally impulsive source, so G(r) gives the response (in space) to a 
spatially impulsive (current) source. The response to a spatially distributed source 
is obtained by integration, and plays the same role in space that convolution in 
system theory does in time: 


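$$ y(t) = h(t) * x(t) \quad \Longleftrightarrow \quad \mathbf{E}(\mathbf{r}) = \bar{\bar{G}}(\mathbf{r}, \mathbf{r}') * \mathbf{J}(\mathbf{r}') \tag{7.1} $$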


We have already encountered the free-space Green function in our work in 
Chapter 4, although we made only passing reference to it then. In free space, the 
function is (moderately) simple: 

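$$ \bar{\bar{G}}(\mathbf{r}, \mathbf{r}') = \left( \bar{\bar{I}} + \frac{1}{k^2} \nabla\nabla \right) g(\mathbf{r}, \mathbf{r}'), \qquad g(\mathbf{r}, \mathbf{r}') = \frac{e^{-jkR}}{4\pi R} \tag{7.2} $$

where $R = |\mathbf{r} - \mathbf{r}'|$ is the distance from source to field point. Green functions can
be obtained for either fields or potentials, and in the above, $\bar{\bar{G}}(\mathbf{r}, \mathbf{r}')$ is the electric
field Green function for free space, and $g(\mathbf{r}, \mathbf{r}')$ is the potential Green function for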
free space. We will primarily use Green functions for potentials in this chapter. It 
is worth highlighting that the Green function for free space is given in closed form 
and is trivial to compute (although the singularities which accompany it make an 
accurate MoM implementation anything but!). 

Some new notation has been introduced in the above. The double-overbar nota- 
tion indicates a dyad; this is a mathematical device which after multiplication by 
a vector, yields a vector. A dyad typically consists of the following terms, when 
written as a matrix: 


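$$
\bar{\bar{G}} =
\begin{bmatrix}
G^{xx} & G^{xy} & G^{xz} \\
G^{yx} & G^{yy} & G^{yz} \\
G^{zx} & G^{zy} & G^{zz}
\end{bmatrix}
\tag{7.3}
$$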


It is also frequently written out in its component form: 


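$$
\begin{aligned}
\bar{\bar{G}} = {} & G^{xx}\hat{x}\hat{x} + G^{xy}\hat{x}\hat{y} + G^{xz}\hat{x}\hat{z} \; + \\
                   & G^{yx}\hat{y}\hat{x} + G^{yy}\hat{y}\hat{y} + G^{yz}\hat{y}\hat{z} \; + \\
                   & G^{zx}\hat{z}\hat{x} + G^{zy}\hat{z}\hat{y} + G^{zz}\hat{z}\hat{z}
\end{aligned}
\tag{7.4}
$$

The product of a dyad and vector is then computed using normal matrix theory
or the usual vector dot-products. $\bar{\bar{I}}$ is the identity dyad. Note that although both
operations $\hat{s} \cdot \bar{\bar{G}}(\mathbf{r}, \mathbf{r}')$ and $\bar{\bar{G}}(\mathbf{r}, \mathbf{r}') \cdot \hat{s}$ with $\hat{s}$ a unit vector (i.e. $\hat{x}$, $\hat{y}$ or $\hat{z}$) are
defined, only the latter has physical meaning as the potential due to an $\hat{s}$-oriented
source.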
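
The difference between the two products is easily seen when the dyad is stored as a 3 x 3 array, as in Eq. (7.3). The sketch below is purely illustrative: the numerical entries are placeholders chosen so that the row and column of each entry can be read off, and the function names are not taken from any library.

```python
def dyad_dot_vector(dyad, vec):
    """Post-multiplication G . s: each output component is a row of G dotted with s."""
    return [sum(dyad[r][c] * vec[c] for c in range(3)) for r in range(3)]


def vector_dot_dyad(vec, dyad):
    """Pre-multiplication s . G: each output component is s dotted with a column of G."""
    return [sum(vec[r] * dyad[r][c] for r in range(3)) for c in range(3)]


# Placeholder entries: the tens digit is the row (x, y, z), the units digit the column.
G = [[11, 12, 13],
     [21, 22, 23],
     [31, 32, 33]]
z_hat = [0, 0, 1]

print(dyad_dot_vector(G, z_hat))  # third column: (G^xz, G^yz, G^zz)
print(vector_dot_dyad(z_hat, G))  # third row:    (G^zx, G^zy, G^zz)
```

Post-multiplying by $\hat{z}$ selects the column of components produced by a z-oriented source, which is the physically meaningful product referred to above.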

However, for many applications (such as printed antennas, antennas above or 
buried in a real earth) radiation occurs in a stratified media environment, not free 
space. The presence of the stratified media greatly complicates the analysis. The 
Green function for an elementary dipole radiating in the vicinity of the strati- 
fied medium needs to be worked out. This was done many years ago by Arnold 
Sommerfeld — in 1909, he determined the field radiated by a short vertical electri- 
cal dipole above a dielectric interface. However, the passage of time has not made 
the theory any easier. In particular, the required integration in the complex plane 
brings with it a number of complex issues. Finally, the Green functions obtained 
are not given in closed form, and are computationally expensive to compute, so 
even implementations of seemingly simple problems require some thought. 

Before concluding this introductory section, it should be commented that there 
are a number of MoM formulations for stratified media. This chapter uses the 
mixed potential integral equation formulation (MPIE), introduced by Mosig and 
Gardiol [2] and used with great success for MoM formulations by a number of 
workers. However, before we outline this approach, we will consider a much sim- 
pler problem, which illustrates many of the issues: deriving the Green function for 
stratified media for electrostatics from first principles. 


7.3 A static example of a stratified medium problem: the grounded 
dielectric slab 


Central to stratified media formulations is the spectral domain transform. The 
Fourier transform is used to simplify the problem by transforming the partial dif- 
ferential equation(s) of electromagnetics in the spatial domain into an ordinary 
differential equation in the spectral domain. (Once again, the analogy with linear 
systems theory is strong.) To illustrate the basic concepts, we will derive the static 
spectral domain Green function for a microstrip structure, as shown in Fig. 7.1. 
This does not include radiation effects, which requires the full-wave solution of the 
problem, the topic of later parts of this chapter. This is still quite useful, nonethe- 
less: the quasi-TEM approach often used for transmission-line analysis renders 
the problem (quasi-)static. A solution can be used to compute the characteristic 
impedance and phase constant of the transmission line by making the calculation 
twice — once with the dielectric present, and once with the dielectric replaced by 
free space [3, p. 166]. Note that the structure is assumed to be of infinite length, 
thus there is no variation in y. 
Figure 7.1 Typical microstrip structure.

Figure 7.2 Stratified medium equivalent with impulsive source $q(x, z) = \delta(x)\delta(z - d)$.


This formulation appears to have been originally presented in the engineering
literature by Yamashita and Mittra [4]. They did not actually derive the Green function;
they were formulating a variational expression for the unknown charge distribution
on the strip, but the extension is straightforward. Their notation is largely
followed here, except that $k_x$ is used as the Fourier transform variable instead of
$\beta$, and $d$ instead of $h$ for the substrate thickness. Booton provides a similar derivation
[5, Section 10.3]. It is interesting to note that an almost identical derivation
may be found in Schwinger's lecture notes [6, Chapter 14]; although only recently
published, these lectures were originally given in 1976.

To derive the Green function, the Poisson equation for a spatially impulsive
source of unit magnitude located at $x = 0$, $z = d$ must be solved (subsequently, the
case $x \neq 0$ is also considered); see Figs. 7.1 and 7.2. Thus the partial differential
equation to solve is:

$$\nabla^2 \Phi(x, z) = -\frac{1}{\epsilon_0}\,\delta(x)\,\delta(z - d) \tag{7.5}$$


The equation is transformed into the spectral domain; using the linearity of the
Fourier transform, the $\partial/\partial x \Leftrightarrow jk_x$ transform property, and the Fourier transform
of the Dirac delta function, one obtains

$$\left[-k_x^2 + \frac{d^2}{dz^2}\right]\tilde{\Phi}(k_x, z) = -\frac{1}{\epsilon_0}\,\delta(z - d) \tag{7.6}$$

with $\tilde{\Phi}(k_x, z)$ the Fourier transform of the potential, also known as the spectral
domain representation:

$$\tilde{\Phi}(k_x, z) = \int_{-\infty}^{\infty} \Phi(x, z)\, e^{-jk_x x}\, dx \tag{7.7}$$

Note that this is now an ordinary differential equation in $\tilde{\Phi}(k_x, z)$. The homogeneous
differential equation (with the inhomogeneous source taken into account via
a Neumann boundary condition) is now solved:

$$\left[-k_x^2 + \frac{d^2}{dz^2}\right]\tilde{\Phi}(k_x, z) = 0, \quad \forall\, z \neq d \tag{7.8}$$

The boundary conditions are: zero potential at $z = 0$ and $z \to \infty$; continuous
potential at the material interface at $z = d$; and flux discontinuous by the source
singularity at $z = d$. These boundary conditions transform in a straightforward
fashion to the spectral domain. The solution to Eq. (7.8) must be written in the two
regions demarcated by the material interface. Note that even if $\epsilon_r = 1$, this two-region
approach is still necessary, so that the jump discontinuity can be enforced.

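The boundary conditions, transformed into the spectral domain, are:

$$\tilde{\Phi}(k_x, 0) = 0 \tag{7.9}$$
$$\tilde{\Phi}(k_x, \infty) = 0 \tag{7.10}$$
$$\tilde{\Phi}(k_x, d^{+}) = \tilde{\Phi}(k_x, d^{-}) \tag{7.11}$$
$$\epsilon_0 \frac{d}{dz}\tilde{\Phi}(k_x, d^{+}) = \epsilon_0 \epsilon_r \frac{d}{dz}\tilde{\Phi}(k_x, d^{-}) - 1 \tag{7.12}$$

The solution of Eq. (7.6) is in the form of exponentials in each region:

$$\tilde{\Phi}_1(k_x, z) = A e^{-k_x z} + B e^{k_x z}, \quad \forall\, 0 < z < d \tag{7.13}$$
$$\tilde{\Phi}_2(k_x, z) = C e^{-|k_x| z} + D e^{|k_x| z}, \quad \forall\, z > d \tag{7.14}$$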


Equation (7.10) immediately yields $D = 0$, and Eq. (7.9) yields $A = -B$. Thus

$$\tilde{\Phi}_1(k_x, z) = -2A \sinh k_x z \tag{7.15}$$

Applying Eq. (7.11) in the limit $d^{\pm} \to d$ one obtains

$$A = -C\,\frac{e^{-|k_x| d^{+}}}{2 \sinh k_x d^{-}} \tag{7.16}$$

The $d/dz$ terms in Eq. (7.12), again in the limit $d^{\pm} \to d$, are thus:

$$\left.\frac{d\tilde{\Phi}_1(k_x, z)}{dz}\right|_{z = d^{-}} = +C k_x\, e^{-|k_x| d} \coth k_x d$$

$$\left.\frac{d\tilde{\Phi}_2(k_x, z)}{dz}\right|_{z = d^{+}} = -C |k_x|\, e^{-|k_x| d}$$

Equation (7.12) yields:

$$C = \frac{e^{|k_x| d}}{\epsilon_0 |k_x| \left[1 + \epsilon_r \coth |k_x| d\right]} \tag{7.17}$$
where the even property of the product of $\coth(k_x d)$ and $k_x d$ has been used to
make the required simplification $k_x \coth k_x d = |k_x| \coth |k_x| d$ (assuming $d > 0$).
The solution for $\tilde{\Phi}_2(k_x, z)$, valid in the limit $d^{\pm} \to d$ for $z > d$, is thus:

$$\tilde{\Phi}_2(k_x, z) = \frac{e^{|k_x|(d - z)}}{\epsilon_0 |k_x| \left[1 + \epsilon_r \coth |k_x| d\right]} \tag{7.18}$$
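The step from the flux condition, Eq. (7.12), to the coefficient $C$ of Eq. (7.17) can be spelled out as follows. This is a worked check that uses only the two derivative expressions given above; it is not additional material from the cited references.

```latex
% Substitute the two derivatives (evaluated at z = d) into Eq. (7.12):
\epsilon_0 \left( -C\,|k_x|\, e^{-|k_x| d} \right)
  = \epsilon_0 \epsilon_r \left( C\, k_x\, e^{-|k_x| d} \coth k_x d \right) - 1

% Collect the terms in C and use k_x \coth(k_x d) = |k_x| \coth(|k_x| d):
C\, \epsilon_0\, |k_x|\, e^{-|k_x| d} \left[ 1 + \epsilon_r \coth |k_x| d \right] = 1
```

Solving the last line for $C$ gives Eq. (7.17).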


For $z = d$, this reduces to:

$$\tilde{\Phi}(k_x, d) = \frac{1}{\epsilon_0 |k_x| \left[1 + \epsilon_r \coth |k_x| d\right]} \tag{7.19}$$

where the subscript 2 has been dropped since we are now on the interface.


This can also be written as:

$$\tilde{\Phi}(k_x, d) = \frac{\sinh |k_x| d}{\epsilon_0 |k_x| \left\{\sinh |k_x| d + \epsilon_r \cosh |k_x| d\right\}} \tag{7.20}$$


(An interesting special case can be identified, viz. $\epsilon_r = 1$. For this case, by expanding
the hyperbolic terms in the denominator, Eq. (7.20) reduces to

$$\tilde{\Phi}(k_x, d) = \frac{e^{-|k_x| d} \sinh |k_x| d}{\epsilon_0 |k_x|} \tag{7.21}$$

This can be useful in asymptotic analysis, where the Green function for a homogeneous
dielectric is used.)
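As a quick numerical check of Eqs. (7.19) to (7.21), the sketch below evaluates the spectral-domain potential on the interface and confirms that the general expression collapses to the homogeneous case when $\epsilon_r = 1$. The function names and the sample values of $k_x$ and $d$ are illustrative assumptions, not taken from the text. The $\coth$ form of Eq. (7.19) is used because it avoids overflow of $\sinh$ and $\cosh$ for large $|k_x| d$; the point $k_x = 0$ is not handled.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space [F/m]


def phi_spectral(kx, d, eps_r):
    """Spectral-domain potential on the interface z = d, Eq. (7.19)/(7.20).

    kx must be non-zero; d > 0 is the substrate thickness in metres.
    """
    a = abs(kx) * d
    coth = 1.0 / math.tanh(a)
    return 1.0 / (EPS0 * abs(kx) * (1.0 + eps_r * coth))


def phi_spectral_homogeneous(kx, d):
    """Special case eps_r = 1, Eq. (7.21).

    exp(-a) * sinh(a) is evaluated as (1 - exp(-2a)) / 2 via expm1 for accuracy.
    """
    a = abs(kx) * d
    return -math.expm1(-2.0 * a) / (2.0 * EPS0 * abs(kx))


if __name__ == "__main__":
    kx, d = 120.0, 1.5e-3  # hypothetical sample point
    print(phi_spectral(kx, d, 1.0), phi_spectral_homogeneous(kx, d))
    print(phi_spectral(kx, d, 4.4))
```

Because only $|k_x|$ appears, the function is even in $k_x$, which is what allows the inverse transform to be folded onto the positive axis in practice.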

Equation (7.20) is the spectral domain Green function for a source located on
the z-axis. The Green function is then the inverse Fourier transform of this:

$$G(x, 0) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \tilde{\Phi}(k_x, d)\, e^{jk_x x}\, dk_x \tag{7.22}$$

and for the general case of the source located at $x'$, this becomes

$$G(x, x') = \frac{1}{2\pi} \int_{-\infty}^{\infty} \tilde{\Phi}(k_x, d)\, e^{jk_x (x - x')}\, dk_x \tag{7.23}$$

The required integral equation for the potential in terms of the charge distribution
$\rho(x, d)$ is thus

$$\Phi(x, d) = \int_{-\infty}^{\infty} G(x, x')\, \rho(x', d)\, dx' \tag{7.24}$$


This, then, is the spectral domain static Green function for a grounded dielectric 
slab. Unfortunately, we note that it must first be inverse Fourier transformed to the 
spatial domain, and doing this for each possible value of the argument x — x’ is 
very time consuming, since numerical integration is required. Interpolation tables 
are often used to accelerate the evaluation of the functions. Another approach is 
to formulate the entire MoM problem in the spectral domain, by using basis func- 
tions which have analytical Fourier transforms. This is described in detail for the 
quasi-static microstrip analysis problem in [7]. However, we will not pursue this 
further here. Instead, we turn our attention to the full-wave case, after first revising 
some concepts from electromagnetic theory regarding scalar and vector potential 
representations. 


7.4 The Sommerfeld potentials 
7.4.1 A brief revision of potential theory 


Before confronting the full-wave stratified medium problem, we will briefly revise
some basic electromagnetic theory, in particular, potential theory. It is often useful
to represent fields in terms of potentials. Classic elementary electrostatics uses
$\mathbf{E} = -\nabla\Phi$. For high-frequency electromagnetics the electrostatic potential is of
course incomplete, and a very widely used set of potentials is

$$\mathbf{E} = -\nabla\phi - \frac{\partial \mathbf{A}}{\partial t} \tag{7.25}$$

$$\mathbf{B} = \nabla \times \mathbf{A} \tag{7.26}$$


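It will be recalled that there is considerable arbitrariness surrounding the choice of
potential (as is well known, a potential $\mathbf{A}' = \mathbf{A} + \nabla\psi$ with $\psi$ any suitable scalar
function, together with the corresponding change in $\phi$, results in the same set of fields);
this is usually resolved via a gauging process. The most widely used in RF engineering
is the "Lorenz gauge," with

$$\nabla \cdot \mathbf{A} = -j\omega\mu\epsilon\,\phi \tag{7.27}$$

$\mathbf{A}$ and $\phi$ must be worked out from

$$\nabla^2 \phi - \frac{1}{c^2}\frac{\partial^2 \phi}{\partial t^2} = -\frac{\rho}{\epsilon} \tag{7.28}$$

$$\nabla^2 \mathbf{A} - \frac{1}{c^2}\frac{\partial^2 \mathbf{A}}{\partial t^2} = -\mu \mathbf{J} \tag{7.29}$$

In the frequency domain, these become

$$\left(\nabla^2 + k^2\right)\phi = -\frac{\rho}{\epsilon} \tag{7.30}$$

$$\left(\nabla^2 + k^2\right)\mathbf{A} = -\mu \mathbf{J} \tag{7.31}$$

and the solutions of these, for differential source elements $d\rho$ and $d\mathbf{J}$, are the potential
Green functions.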

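We have already commented that within one potential representation, the potentials
are not unique. There is also more than one possible potential representation.
Another set involving only electric and magnetic vector potentials may be used;
this was originally introduced by Hertz. In this case, the potentials satisfy the following
Helmholtz equations:

$$\left(\nabla^2 + k^2\right)\mathbf{A} = -\mu \mathbf{J} \tag{7.32}$$
$$\left(\nabla^2 + k^2\right)\mathbf{F} = -\epsilon \mathbf{M} \tag{7.33}$$

where $\mathbf{M}$ is the (fictitious) magnetic current. These are also sometimes written
as $\boldsymbol{\Pi}^{e} = \mathbf{A}/(j\omega\mu\epsilon)$ and $\boldsymbol{\Pi}^{m} = \mathbf{F}/(j\omega\mu\epsilon)$. For the Hertz potentials, the fields in the spatial
domain are given as:

$$j\omega\mu\epsilon\,\mathbf{E} = k^2 \mathbf{A} + \nabla\nabla \cdot \mathbf{A} - j\omega\mu\,\nabla \times \mathbf{F} \tag{7.34}$$
$$j\omega\mu\epsilon\,\mathbf{H} = k^2 \mathbf{F} + \nabla\nabla \cdot \mathbf{F} + j\omega\epsilon\,\nabla \times \mathbf{A} \tag{7.35}$$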
joueH =RPF+V-VFE+ jouv x A (7.35) 


7.4.2 The Sommerfeld potentials 


Preliminaries 


In the stratified medium case, at least two approaches using potentials have been
used. The first uses the field components normal to the interface as potentials.
We will retain the convention of the preceding sections that the interfaces are in
planes of constant $z$; hence, in this case, the potentials would be $E_z$ and $H_z$. Another
possibility is the use of the (Hertz) potentials, of both electric ($\mathbf{A}$) and magnetic
($\mathbf{F}$) type. If only $z$-directed components $A_z$ and $F_z$ are retained, this choice is
traditionally called the Hertz-Debye potentials. The final possibility, and the one
we will investigate since it is the most popular, is the Sommerfeld potentials.

The Sommerfeld potentials, in the absence of magnetic currents, assume $\mathbf{F} = 0$.
A vertical electric dipole (VED), i.e. $z$-directed in our convention, needs only the
$A_z$ component. A horizontal electric dipole (HED) (i.e. parallel to the $x$-$y$ plane)
will require a component parallel to the source as well as the normal component $A_z$.
Hence, the dyadic in this approach will have only five non-zero terms:

$$\bar{\mathbf{G}}_A = \left(\hat{x}\, G_A^{xx} + \hat{z}\, G_A^{zx}\right)\hat{x} + \left(\hat{y}\, G_A^{yy} + \hat{z}\, G_A^{zy}\right)\hat{y} + \hat{z}\, G_A^{zz}\, \hat{z} \tag{7.36}$$


In order to find these terms, we first need some additional background on the spec- 
tral domain. 


The spectral domain transform 


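In the static case discussed previously, no $y$ variation was assumed, and the Fourier
transform was the usual one-dimensional one. For a general structure, we cannot
make this assumption, and the transform (and inverse) becomes two dimensional:

$$\tilde{F}(k_x, k_y) = \frac{1}{2\pi} \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(x, y)\, e^{-jk_x x}\, e^{-jk_y y}\, dx\, dy \tag{7.37}$$

$$f(x, y) = \frac{1}{2\pi} \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \tilde{F}(k_x, k_y)\, e^{jk_x x}\, e^{jk_y y}\, dk_x\, dk_y \tag{7.38}$$

It is useful to introduce the polar vector $\boldsymbol{\rho} = x\hat{x} + y\hat{y}$ (this is simply the usual
radius vector in cylindrical coordinates, $|\boldsymbol{\rho}| = \sqrt{x^2 + y^2}$) and the radial spectral
variable $\mathbf{k}_\rho = k_x \hat{x} + k_y \hat{y}$. This permits the "del" operator $\nabla$ to be split into its
transverse and normal parts as $\nabla = \nabla_t + \hat{z}\,\partial/\partial z$. In the spectral domain, this becomes

$$\tilde{\nabla} = j\mathbf{k}_\rho + \hat{z}\,\frac{\partial}{\partial z} \tag{7.39}$$

Since the only spatial derivative remaining in the spectral domain is with respect to
$z$, the shorter dot notation for derivatives will frequently be used in the following,
for example $\partial\tilde{\Psi}/\partial z = \dot{\tilde{\Psi}}$. Using the Bessel function $J_0$, the above transforms may
be written, for functions with circular symmetry, as

$$\tilde{F}(k_\rho) = \int_{0}^{\infty} J_0(k_\rho \rho)\, f(\rho)\, \rho\, d\rho \tag{7.40}$$

$$f(\rho) = \int_{0}^{\infty} J_0(k_\rho \rho)\, \tilde{F}(k_\rho)\, k_\rho\, dk_\rho \tag{7.41}$$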


This is known as the Fourier-Bessel or Hankel integral transform pair. These are
best known amongst RF and microwave engineers as Sommerfeld integrals.

As in the two-dimensional static case, the introduction of these transforms
permits the spatial domain differential equation (the Helmholtz, rather than the
Laplace of the static case)

$$\left(\nabla^2 + k^2\right)\Psi = 0 \tag{7.42}$$

to be written in the spectral domain as an ordinary differential equation

$$\left(\frac{d^2}{dz^2} - u^2\right)\tilde{\Psi} = 0 \tag{7.43}$$

where the parameter $u$ in the traditional notation of Sommerfeld is given by

$$u^2 = -k_z^2 = k_x^2 + k_y^2 - k^2 = k_\rho^2 - k^2 \tag{7.44}$$

The spectral variable $k_\rho$ is complex valued, and by convention written as $k_\rho = \lambda + j\nu$.
$\lambda$ in this context is the real part of $k_\rho$, and should not be confused with
wavelength.


Normal component representation 


One possibility for stratified media is the use of the normal fields $E_z$ and $H_z$ as
potentials. The normal components satisfy Eq. (7.42) or (7.43) in the spatial or
spectral domain respectively. In the spectral domain, the transverse components
are given by:

$$k_\rho^2 \tilde{E}_x = jk_x \dot{\tilde{E}}_z - \omega\mu k_y \tilde{H}_z \tag{7.45}$$
$$k_\rho^2 \tilde{E}_y = jk_y \dot{\tilde{E}}_z + \omega\mu k_x \tilde{H}_z \tag{7.46}$$
$$k_\rho^2 \tilde{H}_x = jk_x \dot{\tilde{H}}_z + \omega\epsilon k_y \tilde{E}_z \tag{7.47}$$
$$k_\rho^2 \tilde{H}_y = jk_y \dot{\tilde{H}}_z - \omega\epsilon k_x \tilde{E}_z \tag{7.48}$$

As in the static case, the boundary conditions transform in a straightforward
fashion to the spectral domain. Hence, tangential field continuity across the layers
is satisfied if $\epsilon \tilde{E}_z$, $\dot{\tilde{E}}_z$, $\mu \tilde{H}_z$ and $\dot{\tilde{H}}_z$ are continuous. Rather importantly, this means
that the boundary conditions do not introduce coupled equations in $E_z$ and $H_z$.
From the viewpoint of the Green functions, the potentials are the normal components,
but we will not pursue this further now. The Sommerfeld potentials make
use of some normal components, hence the discussion here.

Sommerfeld potentials


In the absence of magnetic currents,² the Sommerfeld approach assumes $\mathbf{F} = 0$.
A VED requires only the $A_z$ component, obtained from the spectral domain relationship

$$j\omega\mu\epsilon\, \tilde{E}_z = k_\rho^2 \tilde{A}_z \tag{7.49}$$

This is obtained from the spectral domain equivalent of Eq. (7.34). $\tilde{E}_z$ is obtained
as above. The other components may be computed from the spectral domain equivalents
of Eqs. (7.34) and (7.35). It may be shown that one obtains the following in
terms of the normal component representation:

$$\tilde{G}_A^{xx} = -\frac{\mu\, \tilde{G}_H^{zx}}{jk_y} \tag{7.50}$$

$$k_\rho^2\, \tilde{G}_A^{zx} = j\omega\mu\epsilon\, \tilde{G}_E^{zx} + \frac{k_x}{k_y}\, \mu\, \dot{\tilde{G}}_H^{zx} \tag{7.51}$$

$$\tilde{G}_A^{yy} = \frac{\mu\, \tilde{G}_H^{zy}}{jk_x} \tag{7.52}$$

$$k_\rho^2\, \tilde{G}_A^{zy} = j\omega\mu\epsilon\, \tilde{G}_E^{zy} - \frac{k_y}{k_x}\, \mu\, \dot{\tilde{G}}_H^{zy} \tag{7.53}$$

$$k_\rho^2\, \tilde{G}_A^{zz} = j\omega\mu\epsilon\, \tilde{G}_E^{zz} \tag{7.54}$$

Regarding boundary conditions at the interface, it may be shown, from Eqs. (7.34)
and (7.35) and using these Sommerfeld potentials, that transverse field continuity
implies that $A_z$ and $\dot{A}_z/\epsilon$ must be continuous for a VED. For an $x$-directed HED,
$A_x$, $\dot{A}_x$, $A_z$, and $\nabla \cdot \mathbf{A}/\epsilon$ must be continuous, and a similar expression holds for
a $y$-directed HED. The last condition couples normal and transverse components
of the Green function, which hence cannot be independently computed. For this
reason, it is usually easier to work with the normal field components, as will be
done shortly.
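For readers who wish to check the relations between the Sommerfeld and normal-component representations for the $x$-directed HED, the following sketch works through the algebra. It assumes the $\partial/\partial x \Leftrightarrow jk_x$ convention used earlier, $\mathbf{F} = 0$, and a homogeneous layer in which $\ddot{\tilde{A}}_z = u^2 \tilde{A}_z$. It is a consistency check, not necessarily the route followed in [1].

```latex
% z-component of Eq. (7.34) with F = 0 and A = \hat{x} A_x + \hat{z} A_z:
j\omega\mu\epsilon\, \tilde{E}_z
  = k^2 \tilde{A}_z + \frac{\partial}{\partial z}\left( jk_x \tilde{A}_x + \dot{\tilde{A}}_z \right)
  = \left(k^2 + u^2\right) \tilde{A}_z + jk_x \dot{\tilde{A}}_x
  = k_\rho^2 \tilde{A}_z + jk_x \dot{\tilde{A}}_x
% For a VED, A_x = 0 and this is Eq. (7.49).

% z-component of \mu H = \nabla \times A for the x-directed HED (A_y = 0):
\mu \tilde{H}_z = -jk_y \tilde{A}_x
  \;\Rightarrow\;
  \tilde{A}_x = -\frac{\mu\, \tilde{H}_z}{jk_y}

% Substituting A_x into the first line:
k_\rho^2 \tilde{A}_z = j\omega\mu\epsilon\, \tilde{E}_z + \frac{k_x}{k_y}\, \mu\, \dot{\tilde{H}}_z
```

The last two lines have the form of the $G_A^{xx}$ and $G_A^{zx}$ relations above once the fields are identified with the corresponding Green function components.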

Symmetry also results in the following expressions, which we note although we
will not use them further:

$$\tilde{G}_A^{xx} = \tilde{G}_A^{yy} \tag{7.55}$$

$$\frac{\tilde{G}_A^{zx}}{k_x} = \frac{\tilde{G}_A^{zy}}{k_y} \tag{7.56}$$

2 As an aside, it should be noted that it is possible to have non-zero $\mathbf{F}$ even with zero magnetic current $\mathbf{M}$, due
to the amount of arbitrariness in the potentials.

7.4.3 An example: derivation of $G_A^{xx}$ for single-layer microstrip


General multi-layered substrates are best handled using a matrix formulation.
Within each substrate, the normal field components are computed for a unit Hertz
dipole embedded in the layered medium. The boundary conditions are handled
using "chain" matrices. A particularly complete description may be found in [1].
However, for the simple but very important case of a single-layer microstrip, we
can directly compute the potentials in a fashion very similar to that described in
Section 7.3. Once again, Fig. 7.2 is relevant, although now the impulsive source
is a horizontal Hertzian dipole, and for convenience the air-dielectric interface,
rather than the ground plane, is at $z = 0$ (and hence the ground plane is located at
$z = -d$). In general, the derivation must be repeated for the five non-zero components
of the Green function, viz. Eq. (7.36), but we will only derive one of these
here: the $x$-directed magnetic vector potential Green function, $G_A^{xx}$. We also restrict
the derivation to non-magnetic lossy dielectric substrates, i.e. $\mu_1 = \mu_0$ and
$\epsilon_1 = \epsilon_0 \epsilon'_r (1 - j\tan\delta)$. We will use $\epsilon_r = \epsilon'_r (1 - j\tan\delta)$ to represent the complex
relative permittivity in the following; it is useful to be able to distinguish between
$\epsilon_r$ and $\epsilon'_r$.

The source-free ODE to be solved for the normal magnetic field in the spectral
domain is of the form of Eq. (7.43), repeated here for the $H_z$ case:

$$\left(\frac{d^2}{dz^2} - u_i^2\right)\tilde{H}_z = 0 \tag{7.57}$$


The solution in each region may either be written as the sum of exponentials, as in
Section 7.3, or as hyperbolic functions. In the upper region $z > 0$, the solution is
of the form

$$\tilde{H}_z = a_0\, e^{-u_0 z} \tag{7.58}$$

which already incorporates the boundary condition at infinity. In the dielectric region,
the solution is of the form

$$\tilde{H}_z = a_1 \cosh u_1 (z + d) + b_1 \sinh u_1 (z + d) \tag{7.59}$$


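The remaining boundary conditions on $\tilde{H}_z$ are:

$$\mu_0 \mu_r\, \tilde{H}_z\big|_{z = 0^{-}} = \mu_0\, \tilde{H}_z\big|_{z = 0^{+}} \tag{7.60}$$
$$\dot{\tilde{H}}_z\big|_{z = 0^{-}} = \dot{\tilde{H}}_z\big|_{z = 0^{+}} \tag{7.61}$$
$$\tilde{H}_z\big|_{z = -d} = 0 \tag{7.62}$$

The last boundary condition may not be immediately apparent. The perfect electric
conductor at $z = -d$ imposes a zero tangential electric field condition, implying
(via Faraday's law) a zero normal magnetic field.

Table 7.1 Values of the amplitude coefficients $U_i$ and $L_i$ associated with the upper
and lower parts of the layer containing the source (after [1, Table 1, p. 150])

Quantity              $U_i$                                        $L_i$
$\tilde{G}_H^{zx}$    $-jk_y/(4\pi u_i)$                           $U_i$
$\tilde{G}_H^{zy}$    $jk_x/(4\pi u_i)$                            $U_i$
$\tilde{G}_E^{zx}$    $-jk_x/(4\pi j\omega\epsilon_i)$             $-U_i$
$\tilde{G}_E^{zy}$    $-jk_y/(4\pi j\omega\epsilon_i)$             $-U_i$
$\tilde{G}_E^{zz}$    $k_\rho^2/(4\pi j\omega\epsilon_i u_i)$      $U_i$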

The above are for the source-free case. In Section 7.3, the effect of the source
was introduced via a boundary condition. Here, we will introduce another method
of dealing with this. A source inside a layer can be taken into
account by adding a solution $\tilde{\Psi}_s$, which is the particular solution corresponding
to the source embedded in an unbounded homogeneous medium. In the spectral
domain, the solution can be written as

$$\tilde{\Psi}_s = \begin{cases} U_i\, e^{-u_i (z_i - D)}, & D < z_i < d_i \\ L_i\, e^{+u_i (z_i - D)}, & 0 < z_i < D \end{cases} \tag{7.63}$$

for a source at $z_i = D$, with $z_i = z + d_i$ the local normal coordinate in each layer.
The amplitude coefficients $U_i$ and $L_i$ depend on the physical quantity represented
by $\tilde{\Psi}$, and are tabulated in Table 7.1. (In the spectral domain, the transform of
an HED of unit magnitude, $\delta(x)\,\delta(z - D)$, is $1/2\pi$. The table takes this and
other factors into account.) In the present case, this source will be located in the
upper medium (free space) at $D > 0$; the limit case $D \to 0$ will be considered
subsequently.
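In the free-space region then, the solution is

$$\tilde{H}_z = a_0\, e^{-u_0 z} - \frac{jk_y}{4\pi u_0}\, e^{+u_0 (z - D)}, \quad \forall\, 0 < z < D \tag{7.64}$$

in the region just above the interface, and for the rest of the region

$$\tilde{H}_z = a_0\, e^{-u_0 z} - \frac{jk_y}{4\pi u_0}\, e^{-u_0 (z - D)}, \quad \forall\, z > D \tag{7.65}$$

It is tempting to set $D$ to zero and use this latter equation immediately, but it yields
the incorrect solution: the boundary conditions at $z = 0$ must be applied to the field
below the source, Eq. (7.64), whose source term has a $z$-derivative of the opposite sign.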

We now apply the boundary conditions and eliminate the three unknown coefficients,
$a_0$, $a_1$ and $b_1$. Application of Eq. (7.62) immediately yields $a_1 = 0$. Applying
Eq. (7.60) for the non-magnetic substrate case ($\mu_r = 1$) in the limit $D \to 0$
yields

$$b_1 = \frac{a_0 - \dfrac{jk_y}{4\pi u_0}}{\sinh u_1 d} \tag{7.66}$$

Application of Eq. (7.61), again in the limiting case, gives

$$a_0 = \frac{-\dfrac{jk_y}{4\pi} + \dfrac{jk_y}{4\pi u_0}\, u_1 \coth u_1 d}{D_{TE}} \tag{7.67}$$

where

$$D_{TE} = u_0 + u_1 \coth u_1 d \tag{7.68}$$


The $D_{TE}$ term (and a similar $D_{TM}$ term, to be defined shortly) are written in this
specific notation because they are linked to surface waves. These can be important
as a mechanism both for loss, and for increasing coupling between elements in a
microstrip patch array. Neither is usually desirable. We will return to this later.
The last coefficient, $b_1$, may now be obtained, and we find for the fields in the
dielectric that

$$\tilde{H}_z = -\frac{jk_y}{2\pi}\, \frac{\sinh u_1 (z + d)}{\sinh u_1 d\; D_{TE}} \tag{7.69}$$

For the case where both source and observer lie on the air-dielectric interface,
$z \to 0$ and this reduces to

$$\tilde{H}_z = -\frac{jk_y}{2\pi}\, \frac{1}{D_{TE}} \tag{7.70}$$

What has now been computed is the spectral domain normal magnetic field due
to an elementary $x$-directed dipole, i.e. $\tilde{G}_H^{zx}$. From Eq. (7.50), we find that

$$\tilde{G}_A^{xx} = -\frac{\mu\, \tilde{G}_H^{zx}}{jk_y} = \frac{\mu_0}{2\pi}\, \frac{1}{D_{TE}} \tag{7.71}$$
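The elimination of the coefficients is short enough to write out in full. The sketch below uses only the boundary conditions and coefficient expressions given above, with $D \to 0$ and $\mu_r = 1$. It is offered as a check on the algebra rather than as part of the original derivation.

```latex
% Numerator of b_1 in Eq. (7.66), using a_0 from Eq. (7.67) and D_TE from Eq. (7.68):
a_0 - \frac{jk_y}{4\pi u_0}
  = \frac{1}{D_{TE}} \left[ -\frac{jk_y}{4\pi}
      + \frac{jk_y}{4\pi u_0}\, u_1 \coth u_1 d
      - \frac{jk_y}{4\pi u_0} \left( u_0 + u_1 \coth u_1 d \right) \right]
  = -\frac{jk_y}{2\pi}\, \frac{1}{D_{TE}}

% Hence, from Eq. (7.66), and Eq. (7.59) with a_1 = 0:
b_1 = -\frac{jk_y}{2\pi}\, \frac{1}{\sinh u_1 d \; D_{TE}},
\qquad
\tilde{H}_z \big|_{z \to 0^{-}} = b_1 \sinh u_1 d = -\frac{jk_y}{2\pi}\, \frac{1}{D_{TE}}
```

The $\coth$ terms cancel exactly, which is why the interface field depends on the substrate only through $D_{TE}$.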


The other components required for a HED may be derived in a similar fash- 
ion. The results are given in Table 7.2. Here, the subscript 1 has been dropped 
on u, since it clearly refers to the substrate. For convenience, the spectral domain 
parameters u and uo are also listed. 
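The spectral-domain quantities of Table 7.2 are simple to evaluate numerically. The following sketch is an illustration only: the function names are hypothetical, the principal branch of the square root is assumed for both $u_0$ and $u$, and the isolated point $u = 0$ is not handled. For $\epsilon_r = 1$, $1/D_{TE}$ should reduce to the spectrum of a dipole plus its reversed image in the ground plane, $(1 - e^{-2u_0 d})/(2u_0)$, which gives a convenient self-check.

```python
import cmath
import math

MU0 = 4.0e-7 * math.pi        # permeability of free space [H/m]
EPS0 = 8.8541878128e-12       # permittivity of free space [F/m]


def spectral_terms(k_rho, k0, d, eps_r):
    """Return (u0, u, D_TE, D_TM) as listed in Table 7.2.

    eps_r may be complex for a lossy substrate. The call fails with a
    division by zero where u = 0 (k_rho**2 == eps_r * k0**2); that single
    point is not treated in this sketch.
    """
    u0 = cmath.sqrt(complex(k_rho) ** 2 - k0 ** 2)          # principal root, Re[u0] >= 0
    u = cmath.sqrt(complex(k_rho) ** 2 - eps_r * k0 ** 2)   # k**2 = eps_r * k0**2
    t = cmath.tanh(u * d)
    d_te = u0 + u / t             # u0 + u coth(u d)
    d_tm = eps_r * u0 + u * t     # eps_r u0 + u tanh(u d)
    return u0, u, d_te, d_tm


def g_a_xx_spectral(k_rho, k0, d, eps_r):
    """Spectral-domain vector potential Green function, first entry of Table 7.2."""
    _, _, d_te, _ = spectral_terms(k_rho, k0, d, eps_r)
    return MU0 / (2.0 * math.pi * d_te)


def g_v_spectral(k_rho, k0, d, eps_r):
    """Spectral-domain scalar potential Green function, third entry of Table 7.2."""
    u0, u, d_te, d_tm = spectral_terms(k_rho, k0, d, eps_r)
    return (u0 + u * cmath.tanh(u * d)) / (2.0 * math.pi * EPS0 * d_te * d_tm)


if __name__ == "__main__":
    # Hypothetical normalized sample point: k0 = 1, k0*d = 0.3, lossy eps_r.
    print(g_a_xx_spectral(2.0, 1.0, 0.3, 5.0 * (1.0 - 0.01j)))
    print(g_v_spectral(2.0, 1.0, 0.3, 5.0 * (1.0 - 0.01j)))
```

Both $u \coth ud$ and $u \tanh ud$ are even in $u$, so the choice of sign for $u$ does not affect the result; only the sign of $u_0$ matters.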


7.4.4 The scalar potential and the mixed potential integral equation 


Table 7.2 Spectral domain Green's functions for a single-layer grounded microstrip structure

Sommerfeld potentials

$2\pi\, \tilde{G}_A^{xx}/\mu_0 = 1/D_{TE}$
$2\pi\, \tilde{G}_A^{zx}/\mu_0 = jk_x (\epsilon_r - 1)/(D_{TE} D_{TM})$
$2\pi \epsilon_0\, \tilde{G}_V = (u_0 + u \tanh ud)/(D_{TE} D_{TM})$

$D_{TE} = u_0 + u \coth ud$, $D_{TM} = \epsilon_r u_0 + u \tanh ud$
$u^2 = k_\rho^2 - k^2$, $u_0^2 = k_\rho^2 - k_0^2$

Both source and observer are on the air-dielectric interface (after [1, Table 2, p. 153]).
$k_0$ is the wavenumber in free space, and $k$ is the wavenumber in the dielectric.

The third entry in Table 7.2 lists a term which requires a brief comment, viz. $G_V$. In
Section 7.4.1, the usual "mixed potential" formulation, Eq. (7.25) (which is valid
for $\mathbf{F} = 0$) was presented. It is actually by no means obvious that the usual scalar
potential,

$$V(\mathbf{r}) = \int_S G_V(\mathbf{r}|\mathbf{r}')\, q_s(\mathbf{r}')\, dS' \tag{7.72}$$

can be extended to a layered medium under dynamic conditions. Fortunately, in the
case of horizontal conducting surfaces, it can be shown that this is indeed valid,
and further that the required scalar Green function is given in the spectral domain


by [1, Section 3.3] 
7 . Gex k 2 Gs 
seas) a) 
0 \ Ske Ko Tkye 


for the Sommerfeld potentials. 

Once the potentials are known, the fields can be computed from the potentials, as
in Section 7.4.1. Before proceeding, it is worthwhile reminding the reader that the
Green functions we have obtained are spectral domain representations; the spatial
domain equivalents are of course defined by:

$$G_A^{xx}(\boldsymbol{\rho}|\boldsymbol{\rho}' = 0) = G_A^{xx}(\rho) = \frac{\mu_0}{2\pi} \int_{0}^{\infty} J_0(k_\rho \rho)\, \frac{k_\rho}{D_{TE}}\, dk_\rho \tag{7.74}$$

$$G_V(\boldsymbol{\rho}|\boldsymbol{\rho}' = 0) = G_V(\rho) = \frac{1}{2\pi\epsilon_0} \int_{0}^{\infty} J_0(k_\rho \rho)\, k_\rho\, \frac{u_0 + u \tanh ud}{D_{TE} D_{TM}}\, dk_\rho \tag{7.75}$$

and these are the functions we require. Again, as a reminder, $\rho$ is radial distance on
the patch surface, $\sqrt{x^2 + y^2}$; $k_\rho$ is the integration variable; by convention, $z = 0$
is the air-dielectric interface; and $J_0(x)$ is the Bessel function of the first kind of
order zero

$$J_0(x) = \frac{1}{\pi} \int_{0}^{\pi} \cos(x \sin\psi)\, d\psi \tag{7.76}$$

Note also that these are the Green functions for a source located at $\boldsymbol{\rho}' = 0$; due
to the translation symmetry, for sources located at a point other than the origin,
all we need do is interpret the radial parameter as the distance from the observer
to the source, i.e. $\rho = \sqrt{(x - x')^2 + (y - y')^2}$. This is also sometimes expressed
as

$$G(x, y|x', y') = G(x - x', y - y'|0, 0) \tag{7.77}$$
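The integral representation of $J_0$ in Eq. (7.76) can itself be evaluated with a few lines of code, which is convenient when no special-function library is at hand. The sketch below applies composite Simpson quadrature to the integrand; the function name and the default number of panels are illustrative choices. Because the integrand is smooth and periodic in $\psi$ with period $\pi$, the rule converges very rapidly for moderate arguments. For large $x$, the number of panels would need to grow with $x$.

```python
import math


def bessel_j0(x, n=200):
    """J0(x) from the integral of Eq. (7.76), composite Simpson rule with n panels."""
    if n % 2:
        raise ValueError("n must be even for Simpson's rule")
    h = math.pi / n
    # End points psi = 0 and psi = pi carry weight 1.
    total = math.cos(x * math.sin(0.0)) + math.cos(x * math.sin(math.pi))
    for i in range(1, n):
        weight = 4 if i % 2 else 2
        total += weight * math.cos(x * math.sin(i * h))
    # Simpson factor h/3, then the 1/pi prefactor of Eq. (7.76).
    return total * h / (3.0 * math.pi)


if __name__ == "__main__":
    for x in (0.0, 1.0, 2.404825557695773):
        print(x, bessel_j0(x))
```

The last sample point is the first zero of $J_0$, so the printed value there should be close to zero.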


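Equipped with these Sommerfeld potentials, we can now write the mixed potential
integral equation (MPIE) for the $x$-directed HED:

$$\hat{z} \times \mathbf{E}^{inc} = \hat{z} \times \left[ j\omega \int_S \bar{\mathbf{G}}_A \cdot \mathbf{J}_s\, dS' + \nabla \int_S G_V\, q_s\, dS' + Z_s \mathbf{J}_s \right] \tag{7.78}$$

The vector potential Green function $G_A$ and scalar potential Green function $G_V$ are as in
the preceding section and are of course known, even if difficult to compute, as is the
excitation $\mathbf{E}^{inc}$.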


7.4.5 Surface waves 


We commented earlier that the $D_{TE}$ and $D_{TM}$ terms are written in this specific
form since they can be interpreted as surface waves. It can be shown that
$D_{TE} = 0$ and $D_{TM} = 0$ are the characteristic equations for, respectively,
TE and TM surface waves propagating in a dielectric layer backed by a perfect conductor
[1, Section 6]. Surface waves can decay as slowly as $1/\sqrt{\rho}$, and hence can
be an important coupling mechanism between patches in a microstrip patch array.
In the integrals required to compute the spatial domain Sommerfeld potentials,
Eqs. (7.74) and (7.75), these enter in the denominator of the integrand, and zeros
in $D_{TE}$ and $D_{TM}$ hence represent poles in the kernel, complicating the integration
process. Fortunately, if $k_0 d \sqrt{\epsilon'_r - 1} < \pi/2$, then $D_{TE}$ has no zeros and $D_{TM}$
has only one, corresponding to the dominant zero-cutoff TM surface wave. This
condition is equivalent to the restriction:

$$f\,[\text{GHz}] < \frac{75}{d\,[\text{mm}]\, \sqrt{\epsilon'_r - 1}} \tag{7.79}$$

For practical substrates, this condition is generally satisfied over most of the microwave
band. Only in the case of a thick substrate of high dielectric constant need
one be concerned with this requirement.

The position of the pole is also required for the integration process. For lossless
substrates, the pole is real ($k_\rho = \lambda_{p0}$) and lies inside the segment of the
real axis $1 < \lambda_{p0}/k_0 < \sqrt{\epsilon'_r}$. For thin substrates, an approximation of its position
is [1, Section 6]:

$$\lambda_{p0}/k_0 \approx 1 + (k_0 d)^2\, \frac{(\epsilon_r - 1)^2}{2 \epsilon_r^2} \tag{7.80}$$

This expression also holds for low-loss substrates, although the pole then migrates
below the real axis, as in Figure 7.3:

$$\lambda_p \approx \lambda_{p0}, \qquad \nu_p/k_0 \approx -(\epsilon'_r - 1) \tan\delta \left(\frac{k_0 d}{\epsilon'_r}\right)^2 \tag{7.81}$$
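The single-pole condition and the thin-substrate pole estimate are easily explored numerically. The sketch below is illustrative, with hypothetical function names, a lossless substrate, and all wavenumbers normalized to $k_0 = 1$. It evaluates the frequency limit of Eq. (7.79) and the estimate of Eq. (7.80), and locates the zero of $D_{TM}$ by bisection for comparison. On the segment $1 < \lambda/k_0 < \sqrt{\epsilon_r}$, $u = jv$ is imaginary, so $u \tanh ud = -v \tan vd$ and $D_{TM}$ is real.

```python
import math


def max_frequency_ghz(d_mm, eps_r):
    """Eq. (7.79): frequency limit [GHz] below which only the TM0 surface wave exists."""
    return 75.0 / (d_mm * math.sqrt(eps_r - 1.0))


def pole_thin_substrate(k0d, eps_r):
    """Eq. (7.80): approximate lambda_p0 / k0 for a thin lossless substrate."""
    return 1.0 + k0d ** 2 * (eps_r - 1.0) ** 2 / (2.0 * eps_r ** 2)


def pole_bisection(k0d, eps_r, tol=1e-12):
    """Zero of D_TM on 1 < lambda/k0 < sqrt(eps_r), lossless case, by bisection."""
    if k0d * math.sqrt(eps_r - 1.0) >= math.pi / 2.0:
        raise ValueError("single-pole condition k0*d*sqrt(eps_r - 1) < pi/2 violated")

    def d_tm(lam):
        u0 = math.sqrt(max(lam * lam - 1.0, 0.0))
        v = math.sqrt(max(eps_r - lam * lam, 0.0))      # u = j v in this range
        return eps_r * u0 - v * math.tan(v * k0d)       # u tanh(ud) = -v tan(vd)

    lo, hi = 1.0, math.sqrt(eps_r)   # d_tm < 0 at lo, d_tm > 0 at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if d_tm(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(max_frequency_ghz(1.0, 2.0))
    for k0d, eps_r in ((0.05, 2.2), (0.2 * math.pi, 5.0)):
        print(k0d, eps_r, pole_thin_substrate(k0d, eps_r), pole_bisection(k0d, eps_r))
```

For the thin case the two values should agree closely. For the thicker, higher-permittivity case the thin-substrate estimate is noticeably low, which is a reminder that Eq. (7.80) is an approximation for small $k_0 d$.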


7.5 Evaluating the Sommerfeld integrals 
7.5.1 Approximate evaluation of the Sommerfeld integrals 


In general, the semi-infinite integrals in the spatial domain Sommerfeld potentials, 
Eqs. (7.74) and (7.75), have no closed-form solution and numerical evaluation, the 
topic of this section, is required. In certain cases, however, approximate solutions 
can be used, and one useful one in the present context is for the magnetic vector
potential $A_x$ for the HED case. Equation (7.74) does not contain the TM pole, with
the result that the vector potential can be approximated by the vector potential
for the homogeneous region $\epsilon_r = 1$. (Physically, the argument is that this is the
magnetic vector potential, which should not be much affected by thin dielectric
sheets.) In this case, the approximation is

$$G_A^{xx}(\rho) \approx \frac{\mu_0}{4\pi} \left[ \frac{e^{-jk_0 R_0}}{R_0} - \frac{e^{-jk_0 R_1}}{R_1} \right] \tag{7.82}$$

with $R_0^2 = \rho^2$ and $R_1^2 = \rho^2 + (2d)^2$. The latter is of course the distance from the
image of the HED in the ground plane, and we recognize this expression as that
of a dipole and its (reversed) image. Although not generally valid, this is a useful
approximation, especially for thin substrates of moderate dielectric constant. Al- 
though an approximation of the scalar potential is also available [1, Section 7.2], 
it turns out to be far less useful in this case and will not be discussed here. 
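The dipole-plus-image approximation of Eq. (7.82) is trivial to code. The sketch below is a minimal illustration with a hypothetical function name; $\rho$ must be non-zero, and all lengths are in metres with $k_0$ in rad/m.

```python
import cmath
import math

MU0 = 4.0e-7 * math.pi  # permeability of free space [H/m]


def g_a_xx_image(rho, d, k0):
    """Eq. (7.82): free-space dipole minus its image a distance 2d below."""
    r0 = rho
    r1 = math.sqrt(rho * rho + (2.0 * d) ** 2)
    direct = cmath.exp(-1j * k0 * r0) / r0
    image = cmath.exp(-1j * k0 * r1) / r1
    return MU0 / (4.0 * math.pi) * (direct - image)


if __name__ == "__main__":
    # Hypothetical sample: 1.5 mm substrate, 3 GHz, observer 10 mm away.
    k0 = 2.0 * math.pi * 3.0e9 / 299792458.0
    print(g_a_xx_image(10.0e-3, 1.5e-3, k0))
```

As $d \to 0$ the image cancels the source, so the approximate vector potential vanishes, as expected for a horizontal dipole lying on a perfect conductor.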

Before proceeding further, the very important point must be made that the techniques
to be discussed here emphasize simplicity, frequently exploiting knowledge
of the specific problem: for instance, we restrict the analysis to the case of a single
pole, and concentrate largely on the lossless substrate case. General-purpose
programs using the Sommerfeld potentials have to handle potentially far more
complex problems, and research still continues on efficient and robust implementations.


A mathematical aside — integration on the complex plane 


The Sommerfeld integrals involve integration on the complex plane, $k_\rho = \lambda + j\nu$
in the present context, or more usually $z = x + iy$ in mathematical notation,
which we will use in this brief note. A few refreshers might be useful here.
Firstly, a function $f(z)$ is analytic (or regular) in a region of the complex plane
if it has a unique derivative at every point of the region. This is a far stronger
condition in the complex plane than on the real line, since an analytic function
has derivatives of all orders. (Many real functions have only derivatives to a
certain order.) The Cauchy-Riemann conditions can be used to test whether a
function is analytic in a region. A singularity is a point where $f(z)$ is not analytic;
in the present context, it usually corresponds to an infinite value of the
function.

Cauchy's theorem, and the resulting integral formula, are crucial: the theorem
states that on a closed contour⁴ $C$:

$$\oint_C f(z)\, dz = 0$$

provided that the function is analytic on and inside $C$.

A very important consequence of this is that if $C = C_1 + C_2$, then
$\int_{C_1} f(z)\, dz = -\int_{C_2} f(z)\, dz$, so that traversing $C_2$ in the reverse direction
gives the same value as $C_1$. This is so important in the context of the
Sommerfeld potentials that it is worth reiterating: provided that the function is
analytic, different integration paths between two points in the complex plane
yield the same result.

Cauchy's integral formula states that under the same limitations as above, the
value of $f(z)$ at $z = a$, $a$ inside $C$, is given by

$$f(a) = \frac{1}{2\pi i} \oint_C \frac{f(z)}{z - a}\, dz$$

We usually apply this in reverse: for a function analytic except for a simple
pole at $z = a$, the above theorem permits us to evaluate the integral. Combined
with Laurent's theorem, this produces the residue theorem, which states that for
isolated singularities within $C$,

$$\oint_C f(z)\, dz = 2\pi i \sum_k R_k$$

where $R_k$ are the residues of $f(z)$ inside $C$. We will discuss finding the residues
subsequently.

4 There are some limitations on the form of $C$: it must not cross itself, and only a finite number of corners are
permitted.
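Cauchy's theorem and the integral formula are easy to confirm numerically, which can be a useful sanity check before applying them to Sommerfeld integrands. The sketch below integrates around a circular contour with the trapezoidal rule; the function name, the test functions and the contour are arbitrary illustrative choices. For an analytic integrand the result should be close to zero, and for a simple pole at $z = a$ inside the contour it should be close to $2\pi i\, f(a)$.

```python
import cmath
import math


def contour_integral_circle(f, center, radius, n=400):
    """Integrate f(z) anticlockwise around a circle using the trapezoidal rule."""
    total = 0.0 + 0.0j
    dtheta = 2.0 * math.pi / n
    for i in range(n):
        phase = cmath.exp(1j * i * dtheta)
        z = center + radius * phase
        dz = 1j * radius * phase * dtheta   # dz = j r exp(j theta) dtheta
        total += f(z) * dz
    return total


if __name__ == "__main__":
    a = 0.5
    # Analytic inside the unit circle: Cauchy's theorem predicts zero.
    print(contour_integral_circle(cmath.exp, 0.0, 1.0))
    # Simple pole at z = a: the integral formula predicts 2 pi i exp(a).
    print(contour_integral_circle(lambda z: cmath.exp(z) / (z - a), 0.0, 1.0))
    print(2j * math.pi * cmath.exp(a))
```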


7.5.2 Numerical integration in the spectral domain 


The spatial domain Sommerfeld potentials, Eqs. (7.74) and (7.75), require integration
over the real positive axis $\lambda$.³ We also note that since the integration is in
the complex plane, the theory of complex functions permits deformation of the
integration path, and a number of approaches avoid the pole(s), deforming the integration
path into the first quadrant. (The reason that the deformation takes this route is
as follows. As already noted, for a lossy dielectric, the pole lies below the real
axis, and the integration (along the real axis) lies above it. In the limit, as the loss
tends to zero, the integration path must remain above the pole.) However, the most
straightforward approach for the case of a simple pole is to integrate along the
real positive axis and this is the approach discussed here. There are, however, two
points along the axis that require special care, the branch point and the pole, and
an asymptotic case needing caution.

Firstly, at $k_\rho = k_0$, the function $u_0 = \sqrt{k_\rho^2 - k_0^2}$ introduces a branch point. This is
due to the multi-valued nature of the complex valued square root function. Which
value to choose is mathematically described as the process of selecting the correct
Riemann sheet. Fortunately, all we need note here is that we should choose
$\text{Re}[u_0] \geq 0$; since the integrand remains bounded at this point, we can integrate
straight through the branch point.
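In code, the choice $\text{Re}[u_0] \geq 0$ usually comes for free, because the principal square root supplied by most libraries already has a non-negative real part. The sketch below uses Python's `cmath.sqrt` with a hypothetical helper name and makes the sheet selection explicit. For real $k_\rho < k_0$ it returns a positive imaginary value, which with $e^{-u_0 z}$ and an assumed $e^{j\omega t}$ time dependence corresponds to an outgoing wave.

```python
import cmath


def u0_of(k_rho, k0):
    """u0 = sqrt(k_rho^2 - k0^2), selected so that Re[u0] >= 0."""
    u0 = cmath.sqrt(complex(k_rho) ** 2 - k0 ** 2)
    # cmath.sqrt returns the principal root, so this branch is a safeguard only.
    return -u0 if u0.real < 0.0 else u0


if __name__ == "__main__":
    k0 = 1.0
    for k_rho in (0.5, 1.0, 2.0, 1.2 - 0.1j):
        print(k_rho, u0_of(k_rho, k0))
```

Passing through $k_\rho = k_0$ along the real axis, $u_0$ changes from purely imaginary to purely real while remaining bounded, which is why no special treatment of the branch point itself is needed.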


A mathematical aside — branch points and branch cuts 


Branch points and cuts arise due to multi-valued functions in the complex plane.
The branch cut is used to demarcate "Riemann sheets," which resolve the ambiguities.
As a simple example, consider $f(z) = z^{1/2}$. Obviously, with $z = Ce^{j\theta}$,
$f(z) = \sqrt{C}\, e^{j\theta/2}$. This is periodic, but with period $4\pi$, and this is where the problems
arise. For instance, consider $\theta = 3\pi/2$ and $\theta = -\pi/2$, the same point on
the complex plane. Now, the two solutions for $f(z)$ are $\sqrt{C}\, e^{j3\pi/4}$ and $\sqrt{C}\, e^{-j\pi/4}$,
clearly not the same point anymore.
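The two-valuedness in this example can be seen directly in a few lines of code. The sketch below uses a hypothetical helper name and $C = 1$. It evaluates $z^{1/2}$ from the polar angle as given, for the two angles that describe the same point $z = -j$, and compares the results with the principal value returned by `cmath.sqrt`.

```python
import cmath
import math


def sqrt_from_angle(c, theta):
    """Evaluate f(z) = z**0.5 for z = c*exp(j*theta), using theta exactly as given."""
    return math.sqrt(c) * cmath.exp(0.5j * theta)


# The same point in the z-plane, described by two different angles.
z_a = cmath.exp(1j * 3.0 * math.pi / 2.0)
z_b = cmath.exp(-1j * math.pi / 2.0)

# The two resulting values of the square root differ by a factor of -1.
f_a = sqrt_from_angle(1.0, 3.0 * math.pi / 2.0)   # exp(+j 3 pi / 4)
f_b = sqrt_from_angle(1.0, -math.pi / 2.0)        # exp(-j pi / 4)

# The library's principal value has Re >= 0 and therefore matches f_b.
principal = cmath.sqrt(z_b)

if __name__ == "__main__":
    print(z_a, z_b)
    print(f_a, f_b, principal)
```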


3 Once again, readers are reminded that in this context, $\lambda = \text{Re}[k_\rho]$. Since we will continue to use $\lambda_0$ as the
free-space wavelength, the potential for confusion is present, but we follow the notation of the literature in this
context.

Riemann sheets adopt some convention to resolve this ambiguity. In this case,
$f(z)$ for $-\pi < \theta < \pi$ is associated with the "top" Riemann sheet, and $f(z)$ for
$\pi < \theta < 3\pi$ with the "bottom" Riemann sheet. This is best illustrated as below:

[Figure: the $z$-plane and the $f(z)$-plane; the bottom Riemann sheet occupies Re[f] < 0 and the top Riemann sheet Re[f] > 0.]

The negative real axis forms the branch cut in the $z$-plane, which opens up to
define the boundary between the Riemann sheets in the $f(z)$ plane. By alternating
between Riemann sheets, the function $f(z)$ can be made continuous. For
instance, as one moves from $\theta = \pi^{-}$ (on the top Riemann sheet) to $\theta = \pi^{+}$,
one must move onto the bottom Riemann sheet, which effectively resolves the
ambiguity of which value of $\sqrt{-1}$ to choose, since we now know we must use
$\pi^{+}$ and not $-\pi^{-}$ when evaluating the function with this convention. In this
case, there were only two Riemann sheets. Other multi-valued functions, such as
$\ln z$, can have infinitely many values and require an infinite number of Riemann
sheets.

Which Riemann sheet one must work on in the present context of Sommerfeld
integrals often requires physical arguments, such as the radiation condition. This,
and related issues, have caused many problems in the history of Sommerfeld
potentials, with incorrect choices having led to unphysical artifacts and much
debate in the literature. An extended discussion may be found in [8, 2.2].

Figure 7.3 Topology of the complex plane $k_\rho = \lambda + j\nu$ for a thin grounded substrate, showing the
branch cut, pole positions and the integration path $C$. For a lossless dielectric, the pole
is on the real axis ($\times_1$); when loss is present, it migrates into the fourth quadrant ($\times_2$).
(Adapted from [1, Fig. 5].)


The second point requiring attention is the pole, due to the TM surface wave. 
This introduces a rapidly varying integrand. Here, we follow [1, Section 8] and in- 
tegrate through the pole (which lies on the real positive axis in the case of a lossless 
substrate), using a special method to extract the singularity which we will describe 
shortly. Note that for the HED, and assuming that the inequality of Eq. (7.79) 
holds (i.e. only the TM pole is present) it is only the scalar potential V which is 
thus affected. 

The final point which one must bear in mind is that the oscillating integrands
have an envelope which converges very slowly in the asymptotic case $\lambda \to \infty$. All
these issues are summarized in Fig. 7.3.

In Fig. 7.4, the general properties of the function to be integrated are shown 
for a rather thick substrate with relatively large dielectric contrast; this has been 
done for clarity, to separate clearly the pole and the branch point, which in many 
practical cases lie close to one another. This figure shows the integrand of the scalar 
potential, Eq. (7.75), written in the following as: 


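$$ V(\rho) = \frac{1}{2\pi\epsilon_0} \int_0^\infty F(\lambda)\, d\lambda \tag{7.83} $$

$$ F(\lambda) = J_0(\lambda\rho)\, \lambda\, \frac{u_0 + u \tanh ud}{D_{TE} D_{TM}} = J_0(\lambda\rho) f(\lambda) \tag{7.84} $$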


where we have used $k_\rho = \lambda + j\nu$ with $\nu = 0$, since the integration is on the real axis.
It has been proposed [1, Section 8] that the real axis be split into three subintervals,
namely $[0, k_0]$, $[k_0, k_0\sqrt{\epsilon_r'}]$ and $[k_0\sqrt{\epsilon_r'}, \infty]$, and we will follow this
approach here. We will investigate only the scalar potential V, since as mentioned
above, the vector potential does not contain the TM pole and can be approximated
using Eq. (7.82) for the case we will study. In each region, we proceed as
follows.

Figure 7.4 Properties of integrand associated with the scalar potential V for an HED.
Parameters as for [1, Fig. 11]: $\epsilon_r' = 5$; $k_0 d = 0.2\pi$; $k_0\rho = 3$; $\tan\delta = 0.01$. Note the omission
of $\pi$ in the expression for $k_0 d$ in [1].


Region 1 $[0, k_0]$
No special care is needed in this region, since the function is well behaved, apart
from an infinite derivative at $\lambda = k_0$. A change of variables $\lambda = k_0 \cos t$ suffices
to make the function very smooth and easily integrated using standard procedures.
Hence, in region 1, the integral to evaluate numerically is:

$$ \int_0^{\pi/2} F(k_0 \cos t)\, k_0 \sin t \, dt \tag{7.85} $$

Note that the minus sign present in the differential $d\lambda = -k_0 \sin t \, dt$ is cancelled by
the interchange of the lower and upper limits of integration required.

The numerical integration in this and all the remaining regions can be performed
in MATLAB using the quad function, which implements adaptive Simpson
quadrature. (Simpson quadrature, the classic numerical integration routine, fits a
quadratic polynomial to the data points to be integrated; due to symmetry, it is
exact to third order. The adaptive variant recursively divides the intervals until the
difference between successive evaluations is less than some specified tolerance.)
Many other types of numerical integration are available and can be applied; see,
for instance, [9, Chapter 4] for an especially entertaining discussion.

Figure 7.5 Detail of Fig. 7.4 in the region $\lambda \in [0.9k_0, 1.4k_0]$.
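
The effect of the change of variables in region 1 can be tried out on a simple stand-in function. The sketch below is written in Python rather than MATLAB, and the routine `adaptive_simpson` is our own minimal illustration of adaptive Simpson quadrature, not the algorithm inside quad. The integrand `g` is a hypothetical test function chosen only because, like F, it has an infinite derivative at $\lambda = k_0$; its integral over $[0, k_0]$ is $\pi k_0^2/4$.

```python
import math

def adaptive_simpson(f, a, b, tol=1e-10, max_depth=50):
    """Integrate f over [a, b] by recursive (adaptive) Simpson quadrature."""
    def simpson(fa, fm, fb, width):
        # Simpson's rule on one panel, from end point and midpoint samples.
        return width / 6.0 * (fa + 4.0 * fm + fb)

    def refine(a, b, fa, fm, fb, whole, tol, depth):
        m = 0.5 * (a + b)
        flm = f(0.5 * (a + m))
        frm = f(0.5 * (m + b))
        left = simpson(fa, flm, fm, m - a)
        right = simpson(fm, frm, fb, b - m)
        delta = left + right - whole
        # Accept once two successive estimates agree (or the depth runs out).
        if depth <= 0 or abs(delta) <= 15.0 * tol:
            return left + right + delta / 15.0
        return (refine(a, m, fa, flm, fm, left, 0.5 * tol, depth - 1) +
                refine(m, b, fm, frm, fb, right, 0.5 * tol, depth - 1))

    fa, fm, fb = f(a), f(0.5 * (a + b)), f(b)
    return refine(a, b, fa, fm, fb, simpson(fa, fm, fb, b - a), tol, max_depth)

K0 = 1.0  # normalized wavenumber (illustrative value)

def g(lam):
    # Hypothetical stand-in integrand with an infinite derivative at lam = K0.
    return math.sqrt(max(0.0, K0 * K0 - lam * lam))

# Direct integration over [0, K0].
direct = adaptive_simpson(g, 0.0, K0)

# Same integral after lam = K0*cos(t); the minus sign of the differential
# is absorbed by swapping the integration limits, as in Eq. (7.85).
smooth = adaptive_simpson(lambda t: g(K0 * math.cos(t)) * K0 * math.sin(t),
                          0.0, 0.5 * math.pi)

exact = 0.25 * math.pi * K0 ** 2
print("error, direct:      ", abs(direct - exact))
print("error, substituted: ", abs(smooth - exact))
```

After the substitution the integrand becomes $k_0^2 \sin^2 t$, which has bounded derivatives everywhere, so the adaptive routine no longer has to refine repeatedly near the end point.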


Region 2 $[k_0, k_0\sqrt{\epsilon_r'}]$

In this region, enlarged in Fig. 7.5, the singularity caused by the pole is clearly 
present. Strictly speaking, with finite loss this is a numerical singularity (or a 
quasi-singularity), since the pole is now slightly below the real axis and the value 
of the function is not truly infinite at the pole; however, for practical situations 
with low-loss substrates, the values are numerically so large that the effect is that 
of a singularity; furthermore, for a lossless substrate, this is a true mathematical 
singularity. 

The approach used here is widely used for dealing with singular and quasi- 
singular integrands in integral equations. To the integrand is added and subtracted 
a function containing the singularity, whose integral can be evaluated analytically. 
In this case, the following is a suitable function: 


$$ F(\lambda) = \left[ J_0(\lambda\rho) f(\lambda) - F_{sing} \right] + F_{sing} \tag{7.86} $$

where

$$ F_{sing}(\lambda) = \frac{R}{\lambda - (\lambda_p - j\nu_p)} \tag{7.87} $$

Here $\lambda_p - j\nu_p$ is the complex pole (with $\nu_p > 0$) and $R$ is the residue of the
integrand at the pole. (We will discuss how to compute this shortly.) To simplify
matters, we will limit ourselves to the case of a lossless substrate, hence the pole
is on the real axis at $\lambda = \lambda_p$; the extension to the low-loss case is moderately
straightforward, however. In this case, the integral in this region of the singular
function may be found as [1, Eq. (110)]⁴


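$$ I_s = \int_{k_0}^{k_0\sqrt{\epsilon_r'}} \frac{R}{\lambda - \lambda_p}\, d\lambda = R \ln \left( \frac{k_0\sqrt{\epsilon_r'} - \lambda_p}{\lambda_p - k_0} \right) - j\pi R \tag{7.88} $$
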
It is worth noting that this is the sum of the principal value (or Cauchy principal
value) of the integral, and the contribution of the pole. (The principal value of a
singular integral avoids the singularity.) The result for lossy materials is also useful
[1, Eq. (109)]:


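$$ I_s = \frac{R}{2} \ln \frac{\nu_p^2 + (k_0\sqrt{\epsilon_r'} - \lambda_p)^2}{\nu_p^2 + (k_0 - \lambda_p)^2} - jR \arctan \frac{k_0\sqrt{\epsilon_r'} - \lambda_p}{\nu_p} - jR \arctan \frac{\lambda_p - k_0}{\nu_p} \tag{7.89} $$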


In Fig. 7.6, the original function $F$, the singular function $F_{sing}$ and the difference
function have been plotted. The last is clearly smooth and readily integrated
numerically. The smoothness has been enhanced by the change of variables
$\lambda = k_0 \cosh t$. The integral in this region is the sum of $I_s$, the analytically integrated
singular function as above, and $I_d$, the numerically integrated difference
function:⁵

$$ I_d = \int_{k_0}^{k_0\sqrt{\epsilon_r'}} \left[ F(\lambda) - F_{sing}(\lambda) \right] d\lambda = \int_0^{\operatorname{arccosh}\sqrt{\epsilon_r'}} \left[ F(k_0 \cosh t) - F_{sing}(k_0 \cosh t) \right] k_0 \sinh t \, dt \tag{7.90} $$



One point that should be mentioned here is that for $\lambda = k_0\sqrt{\epsilon_r'}$, $u = 0$, and the
$\coth ud$ term in $D_{TE}$, in the denominator of the integrand, results in a zero at this
point. Attempting to evaluate this numerically is inadvisable, and the upper integration
limit should be set fractionally below this value. (Since this is a zero and
not a pole, this simple remedy suffices.)

4 Note that this reference incorrectly includes the $j\pi R$ term, $j\pi P$ in their notation, on the left-hand side as well.
Alternatively, the integral on the left-hand side should be a principal value integral.
5 When performing the change of variables, recall that the derivative of $\cosh t$ is $+\sinh t$!

Figure 7.6 The original function ($F$), the singular function $F_{sing}$ and the difference function,
with the change of variables $\lambda = k_0 \cosh t$. All parameters as in Fig. 7.4, except that
$\tan\delta = 0$.
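
The add-and-subtract procedure of region 2 can be sketched on a much simpler integrand. In the Python sketch below, the numerator `g`, the interval and the pole position are all hypothetical choices made so that the answer is known in closed form; only the structure (numerical integration of the difference function plus the analytic result of the same form as Eq. (7.88)) follows the text.

```python
import math

def midpoint(f, a, b, n=2000):
    """Composite midpoint rule; the nodes never coincide with the end points."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

a, b = 0.0, 3.0      # integration interval (illustrative)
x_p = 1.0            # position of the simple pole, inside (a, b)

def g(x):
    # Smooth numerator (hypothetical); the full integrand is g(x)/(x - x_p).
    return x * x

R = g(x_p)           # residue of g(x)/(x - x_p) at the pole

def difference(x):
    # Original integrand minus the singular part R/(x - x_p); smooth near x_p.
    return (g(x) - R) / (x - x_p)

# Numerically integrated difference function (the analogue of I_d).
I_d = midpoint(difference, a, b)

# Analytically integrated singular function (the analogue of I_s):
# principal value plus the pole contribution -j*pi*R.
I_s = R * math.log((b - x_p) / (x_p - a)) - 1j * math.pi * R

total = I_d + I_s
print(total)
```

For this choice the difference function reduces to $x + 1$, so the real part of the result should equal $7.5 + \ln 2$ and the imaginary part $-\pi$. With the number of midpoint nodes used here, none of them falls exactly on the pole, so the 0/0 form of the difference function at $x = x_p$ is never evaluated.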

One final point requiring discussion is the evaluation of the residue. For a function
of a complex variable $z$, with simple pole at $z = z_p$, which is the case we have
here, the residue can be computed by multiplying the function by $z - z_p$ and evaluating
the result at $z = z_p$. It is instructive to attempt this numerically, as shown in
Fig. 7.7. The theoretical value is R = 15.1107; if the numerical result is interpolated
through the pole, one will obtain a value very close to this. The reason that
the curve in Fig. 7.7 exhibits a linear decay to zero in a small region around the
pole is no doubt due to numerical approximations made (by MATLAB, in this case)
when evaluating extremely large-valued functions.

The residue may be found rigorously by noting that the integrand is of the form
$g(z)/h(z)$, with $h(z_p) = 0$, but $h'(z_p) \neq 0$ and $g(z_p) \neq 0$. In this case, the residue
may be computed from

$$ R(z_p) = \frac{g(z_p)}{h'(z_p)} \tag{7.91} $$

Figure 7.7 Result of attempting to evaluate the residue at the pole numerically. Parameters
as in Fig. 7.6.


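For the TM pole, the result is:

$$ R(\lambda_p) = \frac{J_0(\lambda_p \rho)\, \lambda_p\, (u_0 + u \tanh ud)}{D_{TE}' D_{TM} + D_{TE} D_{TM}'} \tag{7.92} $$

with

$$ \frac{d}{d\lambda} D_{TE} = \frac{\lambda}{u_0} + \frac{\lambda}{u} \coth ud - \lambda d \, \mathrm{csch}^2 ud \tag{7.93} $$

$$ \frac{d}{d\lambda} D_{TM} = \epsilon_r \frac{\lambda}{u_0} + \frac{\lambda}{u} \tanh ud + \lambda d \, \mathrm{sech}^2 ud \tag{7.94} $$

In deriving this result, note that $du/d\lambda = \lambda/u$ and $du_0/d\lambda = \lambda/u_0$.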
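
The two ways of obtaining a residue can be compared on a small example. The Python sketch below uses a hypothetical function $g(z)/h(z) = e^z/(z^2 - 1)$, chosen only because its residue at $z_p = 1$ is known to be $e/2$; it is not the Sommerfeld integrand. The formula of Eq. (7.91) gives the residue directly, whereas multiplying by $z - z_p$ can only be evaluated a small distance away from the pole.

```python
import cmath

def g(z):
    return cmath.exp(z)

def h(z):
    return z * z - 1.0           # simple zero at z_p = 1

def h_prime(z):
    return 2.0 * z

z_p = 1.0

# Residue from Eq. (7.91): g(z_p) / h'(z_p).
residue_exact = g(z_p) / h_prime(z_p)

def residue_limit(offset):
    # Direct evaluation of (z - z_p) * g(z)/h(z) a small distance from the pole.
    z = z_p + offset
    return (z - z_p) * g(z) / h(z)

for offset in (1e-2, 1e-4, 1e-6, 1e-8):
    print(offset, abs(residue_limit(offset) - residue_exact))
```

The direct evaluation approaches the analytic value as the offset shrinks, but it divides two nearly vanishing quantities, which is the kind of cancellation that limits the numerical experiment of Fig. 7.7 very close to the pole.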


Region 3 $[k_0\sqrt{\epsilon_r'}, \infty]$
In this region, the function has no singularities or branch points, but contains a
slowly converging integrand, as shown in Fig. 7.8. To accelerate the convergence,
the static term

$$ \frac{J_0(\lambda\rho)}{1 + \epsilon_r} $$

is extracted. Beyond a certain point $\lambda > \lambda'$, the resulting integral is negligible.
Using the standard result (for example, [10, Eq. 24.92])

$$ \int_0^\infty J_0(\lambda\rho)\, d\lambda = \frac{1}{\rho} $$

one obtains

$$ \int_{\sqrt{\epsilon_r'}k_0}^{\infty} F(\lambda)\, d\lambda \approx \int_{\sqrt{\epsilon_r'}k_0}^{\lambda'} \left[ F(\lambda) - \frac{J_0(\lambda\rho)}{1 + \epsilon_r} \right] d\lambda + \frac{1}{\rho(1 + \epsilon_r)} - \frac{1}{1 + \epsilon_r} \int_0^{\sqrt{\epsilon_r'}k_0} J_0(\lambda\rho)\, d\lambda \tag{7.95} $$

Figure 7.8 The integrand in region 3, before and after subtraction of the static term.
Parameters as in Fig. 7.6.


The question of how large to set $\lambda'$ can be determined iteratively. The results to
be shown started with $\lambda' = 10k_0$; the resulting integral was evaluated, as well as
the integral with $\lambda' = 20k_0$. The difference, normalized by the integral in region 2,
between the integrals was then compared, and if too large, the procedure was repeated
with the upper limits doubled. (The integral in region 2 is usually the largest
contributor to the integral, since it includes the contribution of the pole, and hence
was used to normalize this result.) This process is not especially robust, and more
sophisticated procedures are available [1, Section 8.2].
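
The doubling procedure just described might be organized as in the following Python sketch. Everything named here is illustrative: `f` is a hypothetical oscillating integrand decaying as $1/x^2$ (standing in for the integrand after the static term has been removed), `reference` plays the role of the region 2 integral used for normalization, and a fixed-step composite Simpson rule is used in place of quad.

```python
import math

def simpson(f, a, b, h=0.05):
    """Composite Simpson rule with a step of roughly h."""
    n = 2 * max(1, math.ceil((b - a) / (2.0 * h)))   # even number of panels
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * step)
    return total * step / 3.0

def integrate_tail(f, start, first_limit, reference, rel_tol=1e-4, max_doublings=20):
    """Integrate f upward from start, doubling the upper limit until the
    newly added piece is small compared with the reference value."""
    upper = first_limit
    total = simpson(f, start, upper)
    for _ in range(max_doublings):
        extra = simpson(f, upper, 2.0 * upper)   # contribution of [upper, 2*upper]
        total += extra
        upper *= 2.0
        if abs(extra) <= rel_tol * abs(reference):
            return total, upper
    raise RuntimeError("tail integral did not settle")

def f(x):
    # Hypothetical stand-in: oscillating, with an envelope decaying as 1/x^2.
    return math.cos(x) / (1.0 + x * x)

value, upper = integrate_tail(f, 0.0, 10.0, reference=1.0)
print(value, upper)
```

For this stand-in the exact value is $\pi/(2e)$, which allows the stopping rule to be checked. Because the added piece oscillates in sign, it can happen to be small before the tail has really converged, which is one reason the text describes the process as not especially robust.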
Figure 7.9 Plot of $D_{TM}$. Parameters as in Fig. 7.6.


7.5.3 Locating the pole 


The position of the pole must of course be found with considerable accuracy for
the above process to work properly, in particular in region 2. An approximation of
its position has already been given in Eq. (7.80), but this is not sufficient for the
singularity extraction procedure. Finding the pole is equivalent to locating the roots
of $D_{TM}$. In general, finding the roots of a non-linear function is a very challenging
problem, but in the case under consideration, the pole is known to be single, and
located on the real axis in the interval $[k_0, \sqrt{\epsilon_r'}k_0]$. Furthermore, as Fig. 7.9 shows,
the function is purely real valued for $\lambda > k_0$ (the branch point) and changes sign
in this interval $[k_0, \sqrt{\epsilon_r'}k_0]$. A very simple algorithm, such as interval bisection,
yields the root easily. Interval bisection starts with an interval containing a root,
with the function having opposite signs at the interval limits. The function is then
evaluated at the midpoint of the interval, which then replaces whichever limit has
the same sign. This proceeds until the root is found with satisfactory precision.
Despite its simplicity, the algorithm is failsafe in the present case, since it will
always find at least one root, and there is only one. The method also converges
linearly, which is more than sufficient. The algorithm is so simple as not to require
listing; details can be found in any book on numerical analysis, such as [9].
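
For readers who would nonetheless like to see the bisection in concrete form, a minimal Python sketch follows. It assumes the form $D_{TM} = \epsilon_r u_0 + u \tanh ud$ with $u_0 = \sqrt{\lambda^2 - k_0^2}$ and $u = \sqrt{\lambda^2 - \epsilon_r k_0^2}$, which is consistent with the derivative in Eq. (7.94) but should be checked against the definitions given earlier in the chapter. The parameter values are those quoted for Fig. 7.6, with $k_0$ normalized to unity; the function names are our own.

```python
import math

def d_tm(lam, k0, eps_r, d):
    """D_TM = eps_r*u0 + u*tanh(u*d) on k0 <= lam <= sqrt(eps_r)*k0 (lossless).
    In this interval u = j*w is imaginary, so u*tanh(u*d) = -w*tan(w*d)
    and D_TM is purely real."""
    u0 = math.sqrt(max(0.0, lam * lam - k0 * k0))
    w = math.sqrt(max(0.0, eps_r * k0 * k0 - lam * lam))
    return eps_r * u0 - w * math.tan(w * d)

def bisect(f, lo, hi, tol=1e-13, max_iter=200):
    """Interval bisection; f must have opposite signs at lo and hi."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0.0:
        raise ValueError("root is not bracketed")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_mid == 0.0 or (hi - lo) < tol:
            return mid
        if f_lo * f_mid > 0.0:
            lo, f_lo = mid, f_mid    # midpoint replaces the limit of the same sign
        else:
            hi = mid
    return 0.5 * (lo + hi)

k0 = 1.0                 # normalized wavenumber
eps_r = 5.0              # lossless substrate
d = 0.2 * math.pi / k0   # so that k0*d = 0.2*pi

lam_p = bisect(lambda lam: d_tm(lam, k0, eps_r, d), k0, math.sqrt(eps_r) * k0)
print("pole at lambda_p / k0 =", lam_p / k0)
```

The sketch relies on $\tan(wd)$ having no singularity inside the search interval, which holds here because $wd$ never exceeds $0.4\pi$; for thicker substrates this would no longer be guaranteed.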
Slightly lossy materials can also be accommodated, although the root finder
must now work with complex values; fortunately, although $D_{TM}$ is now complex-valued
in the search interval, the overall shape of the function remains very similar
to that of Fig. 7.9, and the imaginary part is small in the search interval.

Here we should comment that all the above holds only for the case of the single 
pole. As soon as more than one pole is present, the pole finding becomes far more 
complex. It is this type of complexity which makes robust, general-purpose codes 
so time-consuming to develop. Further details may be found in [1]. 


7.5.4 General source locations 


The above potentials all assume that the source is located at $(x' = 0; y' = 0)$,
i.e. $\rho' = 0$. For sources at other locations, all that is required is to substitute
$\rho = \sqrt{(x - x')^2 + (y - y')^2}$. This is sometimes written as $V(\rho|\rho')$.


7.5.5 Some results for the Sommerfeld potentials 


Now that the question of the integration of the potentials has been addressed, we
can turn our attention to the potentials themselves. Results are shown in Figs. 7.10
and 7.11, which illustrate the variation due to different substrate thicknesses; note
that the results all converge for very small distances; this is the quasi-static limit.
Figure 7.12 shows the effect of various dielectric constants. It is interesting to
note the “knee” which sets in at progressively smaller distances as the dielectric
constant increases; this corresponds to the transition from static to surface wave
behavior. It will be noted that the potential decays at a slower rate once the surface
wave sets in. It will also be noted that the surface wave is absent in the case of
$\epsilon_r = 1.01$; this is essentially free space, which does not support a surface wave. (To
avoid problems in the routines used, a value slightly larger than unity was used.)
The effect of increasing dielectric constant has already been noted; for practical
antenna design, this means that high-$\epsilon_r$ substrates are likely to have more problems
with mutual coupling between array elements. The same effect is also present as
the substrate thickness is increased.

Figure 7.10 The modulus of the normalized scalar potential for various normalized thicknesses
as a function of normalized distance. $\epsilon_r = 10$; $b = 2k_0 d\sqrt{\epsilon_r - 1}/\pi$.

Figure 7.11 The phase of the scalar potential for various normalized thicknesses as a function
of normalized distance. $\epsilon_r = 10$; $b = 2k_0 d\sqrt{\epsilon_r - 1}/\pi$.

These results are very similar to [1, Figs. 19-21] and serve to validate the im- 
plementation thus far. 


Figure 7.12 Effect of the dielectric constant on the scalar potential. $d/\lambda_0 = 0.05$.


7.6 MoM solution using the Sommerfeld potentials


Now that the potentials are available, the MoM discretization of the MPIE,
Eq. (7.78), can be undertaken. Before we do this, it is useful to identify a suitable
problem. Although microstrip patch antennas⁶ are the dominant application of this
theory at present, they require a surface discretization, supporting vector currents
(that is, the basis function must be able to support both x- and y-directed currents).
A printed dipole is a rather easier problem, since so long as the dipole is relatively
thin, the current flows essentially along the axis of the structure, much as for the
thin dipole in free space that we have already studied in Chapter 4. A printed dipole
is also easily simulated using a commercial code, such as FEKO.

With a suitable problem identified, various possibilities arise with the MoM. 
Perhaps the most popular, especially for “do-it-yourself” research codes, have 
been “rooftop” basis functions, defined on rectangular elements.⁷ More sophisticated
codes generally use the Rao–Wilton–Glisson element. For testing functions,
Galerkin procedures have been widely used; another popular option has been a 
pulse-doublet testing function. Collocation techniques have also been used. We 
will take the opportunity to do something a little different (although also used in 
the literature), namely utilize entire domain basis functions. A very obvious one 
here is a Fourier series expansion; for a symmetrically excited dipole (e.g. center 
fed) only a cosine series is needed, and only the odd numbered terms. 


6 Readers not familiar with this technology should note that some more background on these antennas is presented
in Chapter 8.

7 The term patch instead of element is frequently encountered in the literature; the potential for confusion with 
the patch antenna is obvious and hence element is used here. 
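It is useful to develop the MoM equations from basic principles as another example
of the application of the method. Referring back to the mixed potential integral
equation, Eq. (7.78), repeated here, but with the last term dropped:

$$ \hat{z} \times \mathbf{E}^{inc} = \hat{z} \times \left[ j\omega \int_S \bar{\bar{G}}_A \cdot \mathbf{J}_s \, dS' + \nabla \int_S G_V q_s \, dS' \right] \tag{7.96} $$

the current is expanded as

$$ \mathbf{J}_s \approx \sum_{n=1}^{N} I_n \mathbf{F}_n \tag{7.97} $$

From the continuity equation, the charge is therefore expanded as

$$ q_s \approx \sum_{n=1}^{N} I_n \frac{-\nabla \cdot \mathbf{F}_n}{j\omega} \tag{7.98} $$

Note that the basis functions are effectively scalar in this case.
Introducing testing functions $\mathbf{W}_m$ and carrying out the weighted residual process
as usual, we obtain:

$$ \hat{z} \times \int_S \mathbf{W}_m \cdot \mathbf{E}^{inc} \, dS = \hat{z} \times \sum_{n=1}^{N} I_n \left[ j\omega \int_S \mathbf{W}_m \cdot \int_S \bar{\bar{G}}_A \cdot \mathbf{F}_n \, dS' \, dS - \frac{1}{j\omega} \int_S \mathbf{W}_m \cdot \nabla \int_S G_V \nabla' \cdot \mathbf{F}_n \, dS' \, dS \right] \tag{7.99} $$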

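One subtlety worth commenting on here is the manipulation of the second surface
integral on the right-hand side of the above equation. Using the vector identity
$\nabla \cdot (a\mathbf{b}) = a \nabla \cdot \mathbf{b} + \mathbf{b} \cdot \nabla a$, and identifying $\mathbf{b} = \mathbf{W}_m$ and $a$ as the inner integral, one
obtains

$$ \hat{z} \times \int_S \mathbf{W}_m \cdot \nabla \int_S G_V \nabla' \cdot \mathbf{F}_n \, dS' \, dS = \hat{z} \times \int_S \nabla \cdot \left[ \mathbf{W}_m \int_S G_V \nabla' \cdot \mathbf{F}_n \, dS' \right] dS - \hat{z} \times \int_S \nabla \cdot \mathbf{W}_m \int_S G_V \nabla' \cdot \mathbf{F}_n \, dS' \, dS \tag{7.100} $$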


The first term on the right-hand side in the above may be eliminated by applying
a variant of the divergence theorem, known as the surface divergence theorem. For
an open surface S bounded by contour C, this states that for a vector function $\mathbf{f}$

$$ \int_S \nabla_s \cdot \mathbf{f} \, dS = \oint_C \hat{m} \cdot \mathbf{f} \, dC \tag{7.101} $$

with $\hat{m}$ the unit vector normal to contour C, but tangential to surface S [11, p. 712].
$\nabla_s \cdot$ is the divergence operator in the surface, and this is precisely what $\hat{z} \times$ selects.⁸
Note that unlike Stokes's theorem, the contour integral in this case evaluates normal
fields on the boundary. Hence this term can be written in terms of a contour integral
of a quantity related to current, normal to the bounding contour. Since normally
directed current should go to zero at the edge of the dipole, this term is zero.
Strangely, few references on this topic explain this point.

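The MPIE thus results in the standard MoM matrix equation $[Z]\{I\} = \{V\}$. For
convenience, it is useful to split the impedance matrix in two

$$ Z_{mn} = a_{mn} + v_{mn} \tag{7.102} $$

with matrix and vector entries as follows:

$$ a_{mn} = j\omega \int_S \int_S \mathbf{F}_m(\rho) \cdot \bar{\bar{G}}_A \cdot \mathbf{F}_n(\rho') \, dS' \, dS $$

$$ v_{mn} = \frac{1}{j\omega} \int_S \int_S \nabla \cdot \mathbf{F}_m(\rho) \, G_V \, \nabla' \cdot \mathbf{F}_n(\rho') \, dS' \, dS $$

$$ b_m = \int_S \mathbf{F}_m \cdot \mathbf{E}^{inc} \, dS \tag{7.103} $$
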


For the case of a thin printed dipole, we will make a number of assumptions
similar to those of our earlier work on the thin-wire dipole. It will be assumed
that the current flows only in the x-direction, and that the surface integrals can be
approximated as line integrals. In this case, the integral in the transverse direction,
y, simply results in a constant W, present in both [Z] and [V], and thus cancelling.
Further, the equations (7.103) can be rewritten in scalar form. The result is the
following:

$$ a_{mn} = j\omega \int F_m(x) \int A_x(|x - x'|) F_n(x') \, dx' \, dx $$

$$ v_{mn} = \frac{1}{j\omega} \int \frac{dF_m(x)}{dx} \int V(|x - x'|) \frac{dF_n(x')}{dx'} \, dx' \, dx $$

$$ b_m = \int F_m(x) E_x^{inc} \, dx \tag{7.104} $$

As already mentioned, we intend using entire domain basis functions. In this
case, the source (primed coordinates) and field integrals are over the same domain,
namely the length of the wire. Assuming that we center the wire at the origin,
suitable entire domain basis functions are:

$$ F_m = \cos\left( \frac{m\pi x}{L} \right) \tag{7.105} $$

8 The surface divergence operator can be defined in terms of general curvilinear coordinates for curved surfaces,
but in the present case it is unnecessary.

Note that by this choice (with m odd, as noted above), the current goes to zero at the ends of the wire
($x = \pm L/2$) as required, for all the basis functions and hence also for their sum. With
the above geometrical assumptions and these basis functions, and noting that the
domain of integration is the same for both source and field points (the length of
the wire), the matrix entries become:

$$ a_{mn} = j\omega \int_{-L/2}^{L/2} \cos\left( \frac{m\pi x}{L} \right) \int_{-L/2}^{L/2} A_x(|x - x'|) \cos\left( \frac{n\pi x'}{L} \right) dx' \, dx $$

$$ v_{mn} = \frac{1}{j\omega} \frac{mn\pi^2}{L^2} \int_{-L/2}^{L/2} \sin\left( \frac{m\pi x}{L} \right) \int_{-L/2}^{L/2} V(|x - x'|) \sin\left( \frac{n\pi x'}{L} \right) dx' \, dx $$

$$ b_m = \int_{-L/2}^{L/2} \cos\left( \frac{m\pi x}{L} \right) E_x^{inc} \, dx \tag{7.106} $$


For the source, we will assume a very short feed section, of length $\Delta s$. The
incident (impressed) electric field is thus $V_s/\Delta s$, where $V_s$ is the source voltage.
The result is that

$$ b_m \approx V_s \tag{7.107} $$

It is interesting to note that the same result is obtained by assuming an infinitely
thin Dirac delta source, with $E_x^{inc} = V_s \delta(x)$.
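
The intermediate step behind Eq. (7.107) can be written out as follows. This is a sketch under the assumption that the feed section is centered at $x = 0$ (the center-fed case) and that the impressed field is uniform over it and zero elsewhere; the delta-source result is included for comparison.

```latex
\begin{align}
b_m &= \int_{-\Delta s/2}^{\Delta s/2} \cos\left(\frac{m\pi x}{L}\right) \frac{V_s}{\Delta s}\, dx
     = V_s \, \frac{\sin\left( m\pi \Delta s / 2L \right)}{m\pi \Delta s / 2L}
     \approx V_s \qquad \text{for } m\pi \Delta s \ll 2L \\
b_m &= \int_{-L/2}^{L/2} \cos\left(\frac{m\pi x}{L}\right) V_s \, \delta(x)\, dx
     = V_s \cos(0) = V_s
\end{align}
```

The approximation in the first line therefore degrades for high mode numbers m unless the feed section is correspondingly short.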

The code can now be developed. The integration required must be performed
numerically. In this case, a simple trapezoidal scheme will suffice (implemented in
MATLAB as trapz). An issue which requires a little care is that of singularities;
both the vector and scalar potentials exhibit singularities at the origin. Fortunately,
the singularities are of low order; this is one of the appealing features of the MPIE.
The rigorous method for handling this extracts the singular component (which in
both cases is the static limit) and integrates it analytically, while the remaining part is
integrated numerically, in a fashion already applied in region 2 when evaluating the
scalar Sommerfeld potential. This works very well for subdomain MoM methods
and is relatively easy to implement, since it need only be applied to the “self”
term; unfortunately, with entire domain basis and testing functions, it is rather
more difficult to use. Because the singularity is of relatively low order, it can be
side-stepped numerically, by using integration points for the field and source point
integrals which are slightly offset from one another. If there are N equally spaced
integration points $\Delta = L/N$ apart, instead of sampling at

$$ x_j = -L/2 + \Delta/2 + \Delta(j - 1) $$

(and similarly for $x_j'$) one can use, for instance,

$$ x_j = -L/2 + \Delta/3 + \Delta(j - 1) $$

and

$$ x_j' = -L/2 + 2\Delta/3 + \Delta(j - 1) $$

which offsets the points by $\Delta/3$.
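
The offset-point idea can be tested on a double integral with a known value. In the Python sketch below the kernel $\ln|x - x'|$ is a hypothetical stand-in for a weakly singular potential (it is not the Sommerfeld potential), chosen because its double integral over a unit-length wire is exactly $-3/2$. A simple equal-weight rule is used, with the field and source points placed as in the offset formulas above.

```python
import math

def offset_double_integral(kernel, length, n):
    """Double integral of kernel(|x - x'|) over [-L/2, L/2]^2 using n equally
    spaced points per integral, with field and source points offset by delta/3."""
    delta = length / n
    total = 0.0
    for j in range(n):
        x_field = -0.5 * length + delta / 3.0 + delta * j
        for k in range(n):
            x_source = -0.5 * length + 2.0 * delta / 3.0 + delta * k
            # |x_field - x_source| is never smaller than delta/3, so the
            # kernel is never evaluated at its singularity.
            total += kernel(abs(x_field - x_source))
    return total * delta * delta

def log_kernel(r):
    # Hypothetical weakly singular kernel with a known double integral.
    return math.log(r)

approx = offset_double_integral(log_kernel, 1.0, 200)
exact = -1.5
print(approx, exact)
```

The error of this scheme decreases only slowly (roughly in proportion to $\Delta$) as the number of points is increased, which is consistent with the remark that the singularity is side-stepped rather than treated rigorously.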

To keep things simple, it will also be assumed that the substrate is thin enough 
that the low-frequency approximation of the magnetic vector potential may be 
used, namely Eq. (7.82). 

One other issue which requires attention is computational efficiency. Usually, 
the first implementation of a new method can be done with little regard for this. 
However, the Sommerfeld potentials are sufficiently time consuming to evaluate 
that if some thought is not given to this, even simple problems take far too long 
to solve. Because of the dependence on wavenumber, the potentials are frequency 
dependent, and nothing can be done about this. However, for a particular antenna 
geometry at a specific frequency, the potentials are only a function of radial dis- 
tance p (and in this one-dimensional case, |x — x’|) and a widely used approach 
is to pre-compute the potentials and use interpolation when constructing the MoM 
matrices. This significantly reduces the time required to fill the impedance matrix. 
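
A minimal Python sketch of such a pre-computed table is given below. The class `PotentialTable` and the function `expensive_potential` are hypothetical names introduced for illustration; the latter is a cheap stand-in for a routine that would evaluate a Sommerfeld integral, and simple linear interpolation is assumed.

```python
import bisect
import cmath

def expensive_potential(rho):
    # Cheap stand-in for a costly potential evaluation (hypothetical function).
    return cmath.exp(-1j * rho) / (1.0 + rho)

class PotentialTable:
    """Tabulate a complex-valued function of rho once, then interpolate linearly."""

    def __init__(self, func, rho_min, rho_max, n):
        self.rho = [rho_min + (rho_max - rho_min) * i / (n - 1) for i in range(n)]
        self.val = [func(r) for r in self.rho]

    def __call__(self, rho):
        # Locate the tabulated interval containing rho (clamped to the table).
        i = bisect.bisect_right(self.rho, rho) - 1
        i = min(max(i, 0), len(self.rho) - 2)
        t = (rho - self.rho[i]) / (self.rho[i + 1] - self.rho[i])
        return (1.0 - t) * self.val[i] + t * self.val[i + 1]

# Build the table once per frequency and geometry...
table = PotentialTable(expensive_potential, 0.0, 3.0, 201)

# ...then use it for every |x - x'| needed while filling the matrix.
samples = [0.01 * i for i in range(301)]
max_error = max(abs(table(r) - expensive_potential(r)) for r in samples)
print(max_error)
```

In an actual implementation the table would need to start at the smallest separation that occurs (for instance $\Delta/3$ with offset integration points), since the real potentials are singular at $\rho = 0$ and interpolate poorly close to the origin.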

Results for a MATLAB implementation are shown in Fig. 7.13. The printed
dipole has length $L = 0.39\lambda_0$ and width $W = 0.002\lambda_0$, with relative permittivity
$\epsilon_r = 2.55$, as in [12]. This dipole was designed as an element in a very large
array, with $\lambda_0$ the free-space wavelength corresponding to the center frequency.
For this simulation, this was chosen as 10 GHz, well into the microwave band and
a typical frequency where microstrip is an attractive technology. (Because this is
a single element, one can expect the actual center frequency to differ from this
value; it turns out to be around 0.9 of the design value.) The substrate used in [12]
is very thick (although only the TM mode propagates), and the approximation of
the magnetic vector potential with its static value is insufficiently accurate, so the
simulation here used a thinner substrate, $0.12\lambda_0$ thick.

Figure 7.13 shows three results: one computed using FEKO ($h = \lambda/50$ discretization),
and two computed with a MATLAB code based on the formulation
developed here. The “coarse” result was computed using only 1 mode, with 32
integration points; the “fine” result used 5 modes and 128 integration points. The
reflection coefficient is computed in a $Z_0 = 50\ \Omega$ system. (It should be commented
that this antenna is not very well matched: the reason is that the substrate is not
thick enough to provide sufficient spacing between the antenna and its image in
the ground plane.) Improving the MoM model (the “fine” result) produces a value
for minimum $S_{11}$ very similar to the FEKO result, although at a frequency some
8% higher.

Figure 7.13 Reflection coefficient of a thin printed dipole.
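
For completeness, the post-processing step from the MoM solution to a reflection coefficient might look as follows in Python. This is a sketch under two assumptions not spelled out in the text: that the input impedance is taken as the source voltage divided by the current at the feed, $I(0) = \sum_m I_m$ (every cosine basis function equals unity at $x = 0$), and that the usual definition $S_{11} = (Z_{in} - Z_0)/(Z_{in} + Z_0)$ is used. The function names are our own.

```python
import math

def input_impedance(v_s, mode_currents):
    """Input impedance from the solved mode coefficients I_m.
    Each basis function cos(m*pi*x/L) equals 1 at the feed, x = 0."""
    feed_current = sum(mode_currents)
    return v_s / feed_current

def reflection_coefficient_db(z_in, z0=50.0):
    """|S11| in dB for a load z_in in a z0 system."""
    gamma = (z_in - z0) / (z_in + z0)
    magnitude = abs(gamma)
    if magnitude == 0.0:
        return float("-inf")     # perfect match
    return 20.0 * math.log10(magnitude)

# Example with made-up coefficients: a purely resistive 100 ohm input impedance.
z_in = input_impedance(1.0, [0.01, 0.0])
print(z_in, reflection_coefficient_db(z_in))
```

Complex mode coefficients, as produced by the matrix solution, can be passed in unchanged, since Python's built-in complex type is used throughout.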

This is not very accurate — certainly not sufficient for engineering design pur- 
poses — but provides verification of our formulation and implementation. The aim 
of this section has not been to develop an accurate engineering tool per se, but 
rather to demonstrate the basic operation of the Sommerfeld approach and this 
has been achieved. Nonetheless, there are various things one could do to improve
this scheme. Firstly, the magnetic vector potential should be implemented as a full 
Sommerfeld integral, rather than approximated by its low-frequency value as at 
present. Secondly, the integration scheme used in region 3 of the Sommerfeld inte- 
gral would benefit from some refinement. Thirdly, the singularities in the MoM 
impedance matrices should be properly addressed; a subdomain MoM scheme 
might make this easier. We will not, however, pursue this here. There are sufficient
problems remaining in this field that entire books can be (and have been) written
on this topic. Instead, we turn our attention in the next chapter to the use of
a commercial package which has a very comprehensive implementation of this 
theory [13]. 
Coding hints: coping with complexity


The implementation of the Sommerfeld formulation discussed here is one of 
the most complex coding tasks in this book. The author’s implementation used 
one main MATLAB m-file and some nine or ten functions, each of course in its 
own file. The total code ran to around 300 lines of MATLAB. This sounds quite 
modest, but MATLAB is particularly terse due to the vector nature of much of 
the code, implicit typing and high-level functions available (e.g. matrix solution, 
Bessel functions), so this would probably run to several thousand lines of code 
in languages such as FORTRAN, C, C++ or Java. How does one cope with the 
complexity that this brings? Here are some tips gleaned from twenty years of 
coding. 


• Firstly, start modestly. Do not try to develop a general-purpose program from the
start — unless of course this is one’s job description. (Even then, one would be advised 
to code a simple implementation first, to learn the basics of the method if one is not 
familiar with it.) Writing general-purpose software is astonishingly difficult and time 
consuming, which is why good CEM software is not cheap. 

• Secondly, use existing packages where possible. Writing an LU factorization routine
is really unnecessary: there are industrial strength routines available in the excellent 
public domain LAPACK suite. Evaluating special functions is also more complex 
than it appears; books such as [9] offer routines* for Bessel functions, root finding 
etc. 

• Thirdly, use a proper scientific programming environment. Writing one’s own code
for complex numbers is absurd — find an environment which supports this, or at least 
has proper libraries. Life is too short to code a + jb! By and large, computer sci- 
entists appear to prefer to ignore complex numbers, and it usually takes some time 
for whatever the latest fashionable programming language is to include this. This re- 
mains one of the strengths of FORTRAN — complex numbers are a built-in datatype. 
MATLAB is especially suited for the type of development discussed in this book, due to 
the very large number of high-level routines available, excellent support for complex 
numbers, and ease of graphing. For CEM coding, systematic, disciplined and modu- 
lar work usually leads to far better code than supposedly state-of-the-art advances in 
languages. 

• Fourthly, modularity is a key to successful code development. Whilst languages such
as C++ have taken this concept much further with object orientation, the basic idea 
of this is common sense: test sections of the code independently as far as possible. It 
is much easier to locate the problem in the evaluation of, for instance, the scalar Green 
function in region 2 when this is implemented and tested separately, than to track this 
down as part of a complex code. 


* Be warned that these are not public domain codes.

• Fifthly, debug intelligently. This is best discussed by anecdote. The key question is:
what are the symptoms of the bug? Two which caused problems in the present case 
were: evaluation of a special function at a singular point (fortunately, MATLAB warns 
of this, and then it was just a question of locating the specific call and value of the 
argument); and overlooking the factor 2 in $R_1^2 = \rho^2 + (2d)^2$ in Eq. (7.82). The symptoms
in the latter case were an incorrect reactance, which was traced to incorrect [Z]
elements; since the contributions from the scalar potential had already been validated, 
the error probably lay in the vector potential, and thus the bug was located. This lat- 
ter case is an example of a strange phenomenon of bugs: they are frequently located 
in some part of the code which should be very simple. Perhaps it is human nature 
to concentrate on the hard tasks and pay insufficient attention to the straightforward 
ones? 


• Finally, validate your code carefully. This is very important, and we have emphasized
this on several occasions. 


7.7 Further reading 


The development in this chapter is largely based on that of Mosig [1]. A similar, 
although not quite as comprehensive, treatment may be found in [14], and most 
of the key equations are also available in this source. Both of these contain quite 
extensive lists of references for further reading. For the specific development of an 
MoM code for microstrip antennas using the Sommerfeld potentials, these are the 
key references, containing a wealth of detail of implementation issues. Another 
contemporary publication was the monograph by Hansen [15]; this is somewhat 
more general in scope, addressing not just microstrip structures, but also compu- 
tational issues in detail. 

The formulation as discussed in this chapter addressed only single-layer 
grounded lossy dielectrics. It can be extended to include multi-layer substrates
and superstrates, with conductors of finite conductivity, so complex microstrip an- 
tenna arrays can be accurately modelled; details may be found in [1, 14]. (The 
half-space problem can also of course be addressed — this was the subject of 
Sommerfeld’s original investigations.) Microstrip antennas can be fed via feed 
pins, side feeds, or aperture coupling; the first two are readily implemented within 
the electric field MPIE MoM as in this chapter. It is possible to extend the formu- 
lation to include magnetic currents as well, which permits aperture coupling to be 
modelled efficiently. 

For other, more general, treatments of stratified media, Chew’s work is par- 
ticularly lucid [8]. Chew takes a slightly different approach, developing the 
Sommerfeld integral as a sum (spectrum) of cylindrical waves, and using plane- 
wave theory to handle stacked layers. His treatment is oriented more at buried 
antennas (or scatterers) than microstrip structures, reflecting a geophysical background.
Ishimaru and Kong both provide coverage of these stratified media in their
textbooks [16, 17]. (The transmission matrix formulation widely used for multi- 
layered media was formulated by Kong in an earlier book.) The latter is especially 
concise, perhaps too much so for introductory reading. Again, the emphasis is on 
half-space problems rather than microstrip structures. None of these references 
considers the numerical evaluation of the integrals in any detail. 

Work continues to be published on quite fundamental issues on this topic. Work 
on wires penetrating interfaces between different media was published by Burke 
and Miller [18] and was implemented in NEC-3 and NEC-4. An important general- 
ization of this was Michalski and Zheng’s work [19, 20], which permitted arbitrary 
conducting objects to penetrate the interfaces between dielectrics, using the RWG 
basis functions for the surface discretization. A very comprehensive invited review 
paper by Michalski on handling the “tails” of Sommerfeld integrals appeared quite 
recently [21]. Improved methods for efficient evaluation of the functions also con- 
tinue to appear [22]. Some aspects of the extension of the MPIE discussed in this 
chapter to problems involving both electric and magnetic surface currents are dis- 
cussed in [23]; an attractive feature of this treatment is that it permits very efficient 
modelling of slots in ground planes.

The method of moments and stratified media: practical
applications of a commercial code 


8.1 Printed antenna and microstrip technology: a brief review 


Microstrip patch antennas are an example of a large class of modern antennas 
known as “printed antennas.” Microstrip was originally developed in the early 
1950s as a transmission line, and the first publication on using this structure as 
a radiator appears to have been by Deschamps in 1953 [1, Section 1.1]. Almost
twenty years then passed until the first patent of the modern microstrip antenna
was registered in 1973 by Munson, although the structure was independently
discovered in at least one other location.¹

Microstrip antennas are generally constructed using the same photolithographic
process used to create printed circuit boards. In their simplest form, radiation is
due primarily to energy leaking out of the cavity formed by the patch located close
to a ground plane; physically, the patch is simply a very wide microstrip line.
For the basic rectangular patch, the radiation from two opposite sides reinforces,
whereas that from the other two sides cancels. The patch is usually supported on
a dielectric substrate of some form, primarily for structural reasons. Typical
materials are Teflon and glass-reinforced plastics, as used in printed circuit board
technology. Typical material properties for these are $\epsilon_r$ in the range 2 to 2.5,
and $\tan\delta$ from 0.0004 to 0.002. High-$\epsilon_r$ substrates such as alumina ceramics
produce physically small patches, but with very limited bandwidth. Typical material
properties in this case are $\epsilon_r$ from 9.7 to 10.3 and $\tan\delta \approx 0.0004$. For some
applications, plastic foam substrates have been used. These materials (sometimes using
cheap materials such as expanded polystyrene tiles) have properties close to free space:
$\epsilon_r \approx 1.05$ and $\tan\delta \approx 0.0008$.


1 In 1972, at the National Institute for Defence Research, Council for Scientific and Industrial Research, Pretoria, 
South Africa. Unfortunately, the only references are internal classified memoranda and reports by C. A. van 
der Neut and A. Dubbelman.

Popular shapes are the original rectangular shape, which is still the most com-
mon, as well as square and circular patches. Patches are usually fed either from the
side, typically using a microstrip line, or from below, using either a feed pin (usu- 
ally the center pin of a coaxial cable) or aperture coupling. It is particularly easy 
to manufacture arrays using this technology (compared with wire antennas, for 
instance), since the corporate feed network can share the same substrate as the an- 
tenna. High-performance antennas usually split the feed network and the antenna 
onto two separate layers, to improve bandwidth and minimize unwanted radiation 
from the feed network. Even these are far easier to manufacture than a waveguide 
or wire array. 

The main advantages of the technology are the following: it can be readily in- 
tegrated with microwave circuitry; the antennas are flat, and can be conformed 
to surfaces, since the substrates can be moderately flexible; and it is at least po- 
tentially cheap, although high-quality substrates are not. The main drawbacks are 
limited bandwidth and power-handling capability. The former is the more serious
problem in most applications, and extensive research has focussed on the use of
more complex geometries (double-stacked patches, for instance) in an attempt to
increase the bandwidth.

To read more about this class of antennas, the very comprehensive introductory 
discussion in [2, Chapter 14] can be recommended. Coverage is also available in 
[3, Section 5.8]. For serious designers, [1] is essential reading. 


Figure 8.1 FEKO model of a rectangular patch antenna on a grounded substrate at $\lambda_g/15$
discretization.

8.2 A single patch antenna


In this example, a simple patch antenna is analyzed. The antenna is fed from below
using an offset “feed pin”, which is quite a typical arrangement. The offset is
used to obtain matching; the patch has its highest impedance at the edges, and
lowest impedance in the middle. A rectangular patch will exhibit two orthogonal
resonances, at frequencies where the length or width corresponds to approximately
$0.48\lambda_g$ ($\lambda_g = \lambda_0/\sqrt{\epsilon_r}$ is the wavelength in the substrate dielectric).
In this case, the feed is offset in the x-direction, so the relevant resonance should be
expected at about $\lambda_0 \approx \sqrt{\epsilon_r} \cdot 2.08 \cdot 31.18$ mm, i.e. around 3.1 GHz
(here $2.08 \approx 1/0.48$, and 31.18 mm is the patch length in the x-direction). The
geometry is illustrated in Fig. 8.1. It was generated using FEKO and is based on one of
the examples shipped with the code.
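
The crude resonance estimate above can be checked with a few lines of Python. This is only a sketch of the arithmetic, not part of the FEKO model; the function name and the default factor of 0.48 are illustrative choices taken from the rule of thumb quoted in the text.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def estimate_resonance_hz(patch_length_m, eps_r, length_in_wavelengths=0.48):
    """Crude estimate: the resonant dimension is ~0.48 wavelengths in the dielectric."""
    lambda_diel = patch_length_m / length_in_wavelengths  # wavelength in the substrate
    lambda_0 = math.sqrt(eps_r) * lambda_diel             # free-space wavelength
    return C0 / lambda_0


# Patch length in x (31.18 mm) and substrate permittivity (2.2) from the example.
f_est = estimate_resonance_hz(31.18e-3, 2.2)
print(f"Estimated resonance: {f_est / 1e9:.2f} GHz")
```

The result is close to 3.1 GHz, in line with the estimate quoted above.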


Modelling hints — microstrip antennas 


The PREFEKO model is shown in Fig. 8.2. A few points in this file require 
comment. Firstly, the feed pin must contact a node on the triangular mesh of the 
patch. This problem has been encountered before; the solution is explicitly to 
introduce a node on the patch at this point. (Once again, we comment that this is 
quite a general issue with MoM codes.) Half the patch is then generated using a 
triangle and a quadrilateral both of which include this feed pin node; the entire 
patch is then obtained by imaging in the y = 0 plane as usual (the feed pin lies 
on this plane of symmetry). 

The properties of the substrate are defined using the GF card. Here, we use 
the planar multi-layer option 10, for a grounded single dielectric substrate. The 
other parameters are comprehensively described in the FEKO manual and do not 
require further comment. 


Results for the reflection coefficient of the patch are given in Fig. 8.3, showing
computations for both $\lambda_g/15$ and $\lambda_g/25$. (In Chapter 3, the antenna was also
analyzed using the FDTD, and it was noted that the resonance was just under 3 GHz
for a converged solution.) The antenna is well matched at 2.97 GHz. Compared to
our simple estimate above, this is an error of around 4%, but it should be emphasized
that that was a very crude approximation. The −10 dB impedance bandwidth
is about 100 MHz, or 3%. A simple formula for the bandwidth of microstrip antennas
predicts a bandwidth of around $h/\lambda_0$, which corresponds well with this result
for this $h = 2.87$ mm thick substrate at $\lambda_0 \approx 100$ mm.
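
As a quick numerical check of the bandwidth rule of thumb, the following sketch evaluates $h/\lambda_0$ at the computed resonance. The function name is illustrative only, and the rule is the simple approximation quoted in the text, not a design formula.

```python
C0 = 299_792_458.0  # speed of light in vacuum, m/s


def fractional_bandwidth_estimate(substrate_height_m, freq_hz):
    """Rule-of-thumb fractional bandwidth h / lambda_0 for a microstrip patch."""
    lambda_0 = C0 / freq_hz
    return substrate_height_m / lambda_0


f_res = 2.97e9                      # computed resonance of the patch, Hz
frac_bw = fractional_bandwidth_estimate(2.87e-3, f_res)
print(f"Estimated bandwidth: {100 * frac_bw:.1f}% "
      f"(about {frac_bw * f_res / 1e6:.0f} MHz)")
```

The estimate comes out slightly below 3%, which is the same order as the computed −10 dB bandwidth.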


8.3 Mutual coupling between microstrip antennas 


In a number of practical applications, radiating antennas are located sufficiently 
close to one another that significant amounts of energy couple between antennas.

** Example30a: A rectangular patch antenna on a dielectric substrate with
**             a metallic ground plane (wire pin feed)

** Scaling factor since all dimensions below in mm
SF    1                        0.001

** Dimensions of the patch
#len_x = 31.18
#len_y = 46.75

** Feed location and wire diameter
#feed_x = 8.9
#diam = 1.3

** Substrate parameters
#h = 2.87       ** Height
#epsr = 2.2     ** Relative permittivity

** Frequency (for the discretisation)
#freq = 3.0e9
#lam = 1000 * #c0 / #freq / sqrt(#epsr)     ** Wavelength in mm

** Segmentation parameters
IP                             #diam/2   #lam/15   #lam/15

** Generate one quarter of the structure
** Define the points
#x = #len_x - #feed_x
DP    A                        -#feed_x  0.0       0.0
DP    B                        #x        0.0       0.0
DP    C                        #x        #len_y/2  0.0
DP    D                        0.0       0.0       0.0
DP    E                        -#feed_x  #len_y/2  0.0
DP    N                        0.0       0.0       -#h

** Patch
BT    D    B    C
BQ    D    C    E    A

** Symmetry to create the full structure
SY    1    0    3    0

** Feed wire with label 1
LA    1
BL    N    D

** End of geometry
EG    1    0    0    0    0

** Substrate (with groundplane)
GF    10   1    0    [...]     #epsr     [...]
                               #h

** Voltage source at feed point
A2    0    1                   1.0       0.0

** Frequency loop in order to compute the impedance
FR    [...]  0                 2.8e9     3.2e9
** Change the line above as shown below to run with FEKO LITE
** FR    10   0                2.8e9     3.2e9

** Just compute the impedance, no output of surface currents
OS    0

** Far-field pattern at centre frequency
FR    1    0                   3.0e9
FF    1    73   1    [...]
FF    1    73   1    [...]

** End
EN

Figure 8.2 PREFEKO file for the rectangular microstrip patch.

Figure 8.3 Reflection coefficient of the rectangular patch antenna for two discretizations.


This is known as mutual coupling.² In a typical antenna array, this is an important
parameter to establish, since it determines the active impedance — also known as 
the driving point impedance. This is the impedance at each port of the antenna, 
taking into account mutual coupling from all the other antennas. In a simple two- 
element array, the formula is: 


$$Z_a = \frac{V_1}{I_1} = Z_{11} + Z_{12}\frac{I_2}{I_1} \tag{8.1}$$


If both elements are fed with equal amplitude and phase excitations (i.e. $I_1 = I_2$),
the mutual coupling term $Z_{12}$ adds to the self-impedance term $Z_{11}$. Alternatively,
if the antennas are not part of an array, but connected to different RF systems, 
mutual coupling can result in undesired energy leaking between the systems. This 
leads into the field of radiated EMC. 

Mutual coupling, in terms of voltage (or power) transfer, is complicated by pos- 
sible mismatches at both transmitter and receiver. The general formula is quite 
complex, but if both antennas are well matched (in the same $Z_0$) then $S_{12}$ (or $S_{21}$,
which is identical in reciprocal systems) is the voltage transfer ratio. This can be


2 Mutual coupling is used for two related, but not identical, physical parameters. In the one case, it refers to the 
mutual impedance or admittance. In the other, it refers to the energy coupled from one port to another. The 
specific usage is usually clear from the context.

seen from
$$V_1^- = S_{11}V_1^+ + S_{12}V_2^+ = S_{12}V_2^+\big|_{S_{11}=0} \tag{8.2}$$
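
To make the link between mutual impedance and the voltage transfer ratio concrete, the sketch below converts a two-port impedance matrix to S-parameters using the standard two-port conversion formulas. The impedance values are hypothetical, chosen only so that both ports are well matched to 50 Ω; they are not taken from any simulation in this chapter.

```python
import math


def z_to_s_two_port(z11, z12, z21, z22, z0=50.0):
    """Convert two-port Z-parameters to S-parameters (same real Z0 at both ports)."""
    delta = (z11 + z0) * (z22 + z0) - z12 * z21
    s11 = ((z11 - z0) * (z22 + z0) - z12 * z21) / delta
    s12 = 2.0 * z12 * z0 / delta
    s21 = 2.0 * z21 * z0 / delta
    s22 = ((z11 + z0) * (z22 - z0) - z12 * z21) / delta
    return s11, s12, s21, s22


# Hypothetical, well-matched and weakly coupled pair of antennas.
s11, s12, s21, s22 = z_to_s_two_port(50 + 0j, 0.5 + 1j, 0.5 + 1j, 50 + 0j)

v2_plus = 1.0 + 0j   # wave incident on port 2 (transmitting antenna)
v1_plus = 0.0 + 0j   # receiving port terminated in Z0, so nothing is incident
v1_minus = s11 * v1_plus + s12 * v2_plus

coupling_db = 20.0 * math.log10(abs(s12))
print(f"|S11| = {abs(s11):.2e}, coupling = {coupling_db:.1f} dB")
```

For a reciprocal pair the computed $S_{12}$ and $S_{21}$ are identical, and with both ports matched the outgoing wave at port 1 is simply $S_{12}$ times the wave incident at port 2.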


Since microstrip patch antennas are frequently used in an array, it is an inter- 
esting exercise to compute the mutual coupling. We are fortunate in that good 
measured data are available [4]. Jedlicka et al. measured the mutual coupling be- 
tween two patch antennas in both the E-plane (radiating edges adjacent) and the 
H-plane (non-radiating edges adjacent). The former results in far stronger cou- 
pling than the latter, so we will compute E-plane coupling. The elements were 
$L = 10.57$ cm (radiating edge) × $W = 6.55$ cm rectangular patch antennas. The
substrate thickness was 0.1575 cm, with $\epsilon_r = 2.5$. The loss parameter $\tan\delta$ was
not specified, and we will assume it was negligible. The measured resonance fre- 
quency was 1.410 GHz. The patches were pin fed. The feed point impedance at the 
edge of a patch is quite high, and can be reduced by moving the pin a distance $x_0$
from the edge. This feed-pin offset was not specified in the original article, but can 
be computed as follows. The maximum resistance is approximated by Munson’s 
value: 


$$R_m \approx 60\lambda_0/W \tag{8.3}$$
and the input resistance at feed point position $x_0$ in from the patch edge is
$$R \approx R_m \cos^2\left(\frac{\pi x_0}{L}\right) \tag{8.4}$$


For this patch, $R_m \approx 195\ \Omega$ and $x_0/L \approx 0.33$ for a 50 Ω match. Since this is an
approximate value, some fine-tuning is necessary with the simulation package to
establish the optimal $x_0/L$ as about 0.31. This produces a resonant frequency of
$f_r = 1.425$ GHz, around 1% higher than the measured center frequency. Such differences
are very common for narrowband structures; the most probable source
of error is uncertainty in the exact value of $\epsilon_r$, which is usually easily of this
order unless very high quality (and hence expensive) substrates are used. Figure 8.4
shows the computed reflection coefficient. Results are given for both the
isolated element case here, as well as the array case, with another patch one wavelength
away (terminated in a matched load). As before, the predicted bandwidth
of around 0.7% agrees quite well with the computed −10 dB bandwidth of just
under 1%.
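
The feed-offset estimate can be reproduced with the short sketch below, which evaluates Munson's edge resistance and then inverts the cosine-squared taper for a 50 Ω match. The function names are illustrative; the patch dimension and frequency are those quoted in the text, and the result is only the starting point for the fine-tuning described above.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def munson_edge_resistance(freq_hz, width_m):
    """Approximate maximum (edge) resistance, R_m = 60 * lambda_0 / W."""
    return 60.0 * (C0 / freq_hz) / width_m


def feed_offset_fraction(r_edge, r_target=50.0):
    """Solve R = R_m * cos^2(pi * x0 / L) for x0 / L."""
    return math.acos(math.sqrt(r_target / r_edge)) / math.pi


r_m = munson_edge_resistance(1.410e9, 6.55e-2)   # W = 6.55 cm at 1.410 GHz
x0_over_l = feed_offset_fraction(r_m)
print(f"R_m = {r_m:.0f} ohm, x0/L = {x0_over_l:.2f}")
```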

With the design of the basic patch finalized, the patch is replicated to generate 
another patch (Fig. 8.5).

Figure 8.4 Reflection coefficient of the rectangular patch antenna in [4].


Figure 8.5 FEKO model of the two-element rectangular patch antenna array as in [4], for
$1\lambda$ spacing. Only the patches are shown.

Modelling hints


As can be appreciated from the preceding material on the Sommerfeld potentials,
the computational cost of this formulation is quite high. Symmetry should
be exploited as far as possible to reduce this. For this E-plane coupling problem,
there is a plane of magnetic symmetry in the plane containing the feed pins.
Note that many codes support both geometrical symmetry, which is largely a
modelling aid and does not usually significantly reduce computational cost,ᵃ
and field (electric or magnetic) symmetry, which does. For this example,
using symmetry correctly reduced the number of unknowns and the memory
required by a quarter. Unfortunately, the H-plane coupling has no field symmetry,
since the feed pin is offset in this plane.

Most simulation packages have the ability to copy parts of the geometrical 
model. In FEKO, this is done using the translate geometry facility. 


ᵃ An exception is the present case of the Sommerfeld potentials, where using geometrical symmetry can speed
up the matrix fill significantly. 


Computing the mutual coupling is a little tedious; one specifies the inter-element
spacing, runs the code at $f_r$, extracts $S_{12}$ and then repeats the process for the next
spacing. Results computed using FEKO with a $\lambda_0/15$ discretization are given in
Fig. 8.6. (Note that the distance referred to here (and throughout this section) is the
distance between adjacent edges, as in [4], rather than the inter-element spacing of
array theory.) A convergence check was performed on the $D = 0.2\lambda_0$ case using a
$\lambda_0/25$ mesh, which confirmed that $\lambda_0/15$ is quite adequate. There are differences
between the measured and computed data, at most around 2 dB, but this is to be
expected. One reason for this discrepancy is the sensitive nature of this parameter.
Figure 8.7 shows $S_{12}$ as a function of frequency; clearly, very small changes
in frequency can easily result in the type of discrepancy noted in Fig. 8.6, in either
measurement or computation. Another possibility is the experimental setup,
whereby dielectric spacers were inserted as the inter-element spacing increased; 
this is clearly only an approximation of a continuous substrate. Finally, data for 
the same problem computed by Mosig et al. [5, Fig. 8.27] also show differences 
of a similar type between measured and computed data, although in their case the 
agreement is better in some places and worse in others compared with our sim- 
ulation. Their code used entire domain basis functions, so the numerical results 
cannot be expected to be identical. 

For typical narrowband broadside patch array designs, the mutual coupling levels
are relatively small, as we have seen, and may be neglected. This result
rather surprised antenna designers, who were used to the much higher levels of
mutual coupling in wire or slotted waveguide arrays, when microstrip patch arrays
were first developed [6, p. 270]. This is not, however, true of arrays using thick
substrates and/or high dielectric constants, since surface waves can be strongly excited,
resulting in higher levels of coupling. It is also not true of phase scanned
arrays, the topic of the next section.

Figure 8.6 $S_{12}$ for the rectangular patch antennas in the text. Measured data from [4].

Figure 8.7 E-plane mutual coupling between two patches, one wavelength apart, showing
strong frequency dependence.


8.4 An array with “scan blindness” 


The elementary theory of phased arrays can be found in almost any book on an- 
tennas. By adjusting the relative phase between array elements, the position of 
the main lobe (and of course the side lobes) can be moved; if the phasing can be 
changed (either manually or electronically) the beam can be “steered.” Phased ar- 
rays, as such antennas are called, were a crucial defense technology throughout 
the Cold War, with one of the most dramatic examples of the technology being 
the DEWS (Distant Early Warning System) radars deployed by the USA to warn 
of ICBM attack. More recently, “smart” antennas also exploit this effect, although 
usually to move nulls to cancel undesired signal sources rather than position main 
beams to detect targets. 

In practice, however, arrays can exhibit an effect called “scan blindness,” which 
few textbooks discuss, [3, p. 470] being an exception, since the effect is not pre- 
dicted by simple antenna theory. Scan blindness occurs at a specific angle (or 
angles), and at this angle the antenna becomes extremely badly matched, radiat- 
ing essentially no energy. Different types of arrays can suffer from this, including 
waveguide and wire arrays, and also printed arrays such as microstrip patch arrays. 
The common factor in the scan blindness phenomenon is a structure near or on the 
array face capable of supporting a slow wave; a slow wave is one whose phase 
velocity is much less than the velocity of light. (Classic examples are helices, cor- 
rugated surfaces and grounded dielectric slabs.) TM and TE surface waves have 
already been discussed, so it is not surprising that microstrip arrays can suffer 
from this. For printed antennas, two papers by Pozar and Schaubert [7, 8] are the 
key references, with a comprehensive exposition of the problem supported by re- 
sults computed using one of the earlier MoM codes able to handle this type of 
antenna. 

Strictly speaking, scan blindness only occurs in infinite arrays, but in sufficiently 
large finite arrays, the effect in practice is the same: a very poorly matched antenna 
which hardly radiates. It is possible to formulate the problem in the spectral domain 
to produce an infinite array [7, 8], but most commercial codes cannot do this. To 
demonstrate the effect, we will study a large array of thin printed strip dipoles. 
We use this structure, rather than patches, to permit a larger array to be simulated. 
Symmetry should also be used as far as possible to increase the effective array size;
unfortunately, the phasing of the feeds required to scan the array limits the use of
symmetry.

Figure 8.8 256 element printed dipole array.

An example of an array produced in FEKO is shown in Fig. 8.8. Each element
is a strip dipole, length $L = 0.39\lambda_0$ and width $W = 0.002\lambda_0$, with substrate
thickness $h = 0.19\lambda_0$ and relative permittivity $\epsilon_r = 2.55$, as in [7].


Modelling hints — generating a large array 


Generating the array can be an exercise in programming; in FEKO, perhaps the 
simplest approach is to use two nested FOR loops, the inner loop generating the 
dipoles in the E-plane, the outer loop generating “lines” of these dipoles. The 
key loops are shown in Fig. 8.9; the variable #a is the inter-element spacing, and 
#N is the square root of the number of elements — the array is square. Similar 
ideas could also be used in other simulation packages supporting some form of 
scripting. 
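
For readers using a package scripted in a general-purpose language, the same nested-loop idea can be sketched in Python. This is an illustrative analogue of the loop structure described above, not FEKO input; the function names are invented here, and only the element centres and the label of the first half of each dipole are generated.

```python
def half_array_elements(n, a):
    """Return (x_centre, y_centre, label) for the dipoles with y > 0.

    The inner loop steps along x (the E-plane); the outer loop generates
    successive "lines" of dipoles along y. n is the number of elements per
    side and a is the inter-element spacing.
    """
    elements = []
    yc = a / 2.0
    for j in range(1, n // 2 + 1):           # outer loop
        xc = (-n + 1) / 2.0 * a
        for i in range(1, n + 1):            # inner loop
            label = 2 * n * (j - 1) + 2 * i - 1
            elements.append((xc, yc, label))
            xc += a
        yc += a
    return elements


def full_array_centres(n, a):
    """Mirror the half array in the y = 0 plane to obtain all n*n centres."""
    half_elements = half_array_elements(n, a)
    upper = [(x, y) for x, y, _ in half_elements]
    lower = [(x, -y) for x, y, _ in half_elements]
    return upper + lower


N = 16     # elements per side, so 256 elements in total
A = 0.5    # spacing, in free-space wavelengths
half = half_array_elements(N, A)
centres = full_array_centres(N, A)
print(len(half), len(centres))
```

Only half of the array is generated explicitly; the other half follows by mirroring, which is the role symmetry plays in the original model.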


It is interesting firstly to study the effect of the array environment on the element. 
The concept of active impedance has already been introduced in Eq. (8.1) for two- 
element arrays. For an N element array, the active impedance of element i is 


$$Z_a = \frac{V_i}{I_i} = Z_{ii} + \sum_{\substack{j=1 \\ j \neq i}}^{N} Z_{ij}\frac{I_j}{I_i} \tag{8.5}$$

#yc = #a/2
!!FOR #j = 1 to #N/2              ** Outer loop
#xc = (-#N+1)/2*#a
!!FOR #i = 1 to #N                ** Inner loop
#lb = (2*#N)*(#j-1)+2*#i-1
** Generate the strip dipole antenna
DP    A    #xc         #yc-#w/2    #h
DP    B    #xc+#L/2    #yc-#w/2    #h
DP    C    #xc+#L/2    #yc+#w/2    #h
DP    D    #xc         #yc+#w/2    #h
LA    #lb
BP    A    B    C    D
DP    E    #xc         #yc-#w/2    #h
DP    F    #xc-#L/2    #yc-#w/2    #h
DP    G    #xc-#L/2    #yc+#w/2    #h
DP    H    #xc         #yc+#w/2    #h
LA    #lb+1
BP    E    F    G    H
#xc = #xc+#a
!!NEXT
#yc = #yc+#a
!!NEXT
SY    1    0    3    0    #N^2

** Set up array feeds.
!!FOR #k = 0 to 10                ** Start of phase angle loop
#thet = RAD(0+#k*5)               ** scan angle theta in radians
#delfz = #k_0 * #a * sin(#thet)
#lb1 = 1
#lb2 = 2
** Impose progressive phase shift in voltage in E-plane (phi=zero)
!!FOR #j = 1 to #N                ** Outer loop
#phs = 0                          ** re-set phase to zero for each constant-y iteration
!!FOR #i = 1 to #N                ** Inner loop
!!IF (#j = 1) and (#i=1) THEN
** This is the first feed point, new feed (to zero all others).
AE    0    #lb1    #lb2    0    1.0    DEG(#phs)    75
!!ELSE
** Additional feedpoints - add to sources.
AE    1    #lb1    #lb2    0    1.0    DEG(#phs)    75
!!ENDIF
#phs = #phs + #delfz
#lb1 = #lb1+2
#lb2 = #lb2+2
!!NEXT
!!NEXT

Figure 8.9 Key components of the PREFEKO file used to generate the printed dipole array.

Figure 8.10 Reflection coefficient (in a $Z_0 = 75\ \Omega$ system) versus frequency for both an
isolated element and a central element in a 16 element array.


When array feeds are used, this is the impedance automatically computed by
codes such as FEKO, although it is not explicitly called the active impedance.
In Fig. 8.10, the reflection coefficient of an isolated element is compared with that
of an element in the center of a 16 element array. (The elements were discretized
at around $\lambda_0/50$ for this figure.) Note that the resonance frequency moves upwards
by around 10% due to the array environment. To compute this result, voltages of
the same magnitude and phase were applied to each element. Note that this does 
not guarantee a uniformly illuminated array! The reason is that it is the currents 
which determine the radiation pattern, and since the active impedance differs from 
element to element, so does the resulting current. 
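
The point that equal voltages do not give equal currents can be illustrated with a small port-impedance example. The sketch below uses a hypothetical three-element impedance matrix (the numbers are invented for illustration and do not come from the FEKO model), solves for the port currents with a plain Gaussian elimination, and evaluates the active impedance both as $V_i/I_i$ and via the sum in Eq. (8.5).

```python
def solve_linear(matrix, rhs):
    """Solve a small complex linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0j] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


# Hypothetical port impedances (ohms) for a three-element linear array.
Z_SELF, Z_NEAR, Z_FAR = 70 + 10j, 10 - 20j, -5 + 3j
z = [[Z_SELF, Z_NEAR, Z_FAR],
     [Z_NEAR, Z_SELF, Z_NEAR],
     [Z_FAR, Z_NEAR, Z_SELF]]
v = [1 + 0j, 1 + 0j, 1 + 0j]        # equal-amplitude, equal-phase voltage feeds

currents = solve_linear(z, v)
z_active = [v[k] / currents[k] for k in range(3)]
z_active_sum = [
    z[k][k] + sum(z[k][j] * currents[j] / currents[k] for j in range(3) if j != k)
    for k in range(3)
]

for k in range(3):
    print(f"element {k + 1}: |I| = {abs(currents[k]):.5f} A, Z_a = {z_active[k]:.1f} ohm")
```

The two outer elements carry the same current by symmetry, but the central element does not, so the array is not uniformly illuminated even though the voltages are identical.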

Now, the effect of scan angle can be determined. For an $m \times n$ array of sources,
to scan a beam to an angle $(\theta_s, \phi_s)$ off broadside requires that the $(m, n)$th
source should be phased as


$$e^{jk_0(ma\sin\theta_s\cos\phi_s + nb\sin\theta_s\sin\phi_s)} \tag{8.6}$$


This assumes that the array axes are aligned with the x- and y-axes, as in Fig. 8.8, 
and that the spacing along these axes is a and b respectively. For reasons discussed 
in detail in [7], only the E-plane scan (the x-z plane in Fig. 8.8, i.e. $\phi = 0$) exhibits
scan blindness, and our simulation will only investigate this plane of scan. We also 
assume that the inter-element spacings are equal in both planes, that is, $a = b$.

Figure 8.11 Reflection coefficient versus scan angle for a 256 element array and an infinite
array; the latter data are from [7].


Hence the progressive phase advance (or delay) to add to each element in this 
plane is 


$$\Delta = k_0 a \sin\theta_s \tag{8.7}$$


No progressive phase shift is required in the other plane. 
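
The phasing rule is easily evaluated numerically. The sketch below assumes a positive exponent in the element phasor and works in units of the free-space wavelength; the function names are illustrative. It computes the progressive phase for an E-plane scan and confirms that, for $\phi_s = 0$, the general element phasor reduces to a phase that depends only on the index along x.

```python
import cmath
import math


def element_phasor(m, n, a, b, theta_s, phi_s, k0):
    """Unit-amplitude excitation phasor for element (m, n) of a planar array."""
    psi = k0 * math.sin(theta_s) * (m * a * math.cos(phi_s) + n * b * math.sin(phi_s))
    return cmath.exp(1j * psi)


def progressive_phase(a, theta_s, k0):
    """Phase step between adjacent elements for an E-plane (phi_s = 0) scan."""
    return k0 * a * math.sin(theta_s)


wavelength = 1.0                      # work in free-space wavelengths
k0 = 2.0 * math.pi / wavelength
a = b = 0.5 * wavelength              # inter-element spacing in both planes
theta_s = math.radians(45.0)          # scan angle near the blind angle

delta = progressive_phase(a, theta_s, k0)
print(f"Progressive phase: {math.degrees(delta):.2f} degrees per element")
```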

The results of a simulation of a 256 element array are shown in Fig. 8.11. The
inter-element spacing is $a = 0.5\lambda_0$ in both planes. Except where mentioned, the
substrate was assumed lossless. These results were computed using a $\lambda_g/25$
discretization, which gave acceptable accuracy. For comparison, results computed using
an infinite array code [7] are also plotted. The agreement is surprisingly good,
and demonstrates how scan blindness can impact on a finite antenna which is of
a quite practical size. (Although not shown, an 8 × 8 array gives a similar result,
although the reflection coefficient peak is somewhat lower.) Note that, as in [7], all
reflection coefficients for this array are referred to a $Z_0 = 75\ \Omega$ system.

Radiation patterns for scan angles of 40°, 45° and 50° for this 256 element
array are shown in Fig. 8.12. (The phasing actually produces a scan angle of −40°,
−45° and −50°; we note this and do not mention it again.) For this computation,
a small amount of loss was added to the substrate; $\tan\delta = 0.002$ was used, which
is representative of a good low-loss Teflon-fiberglass substrate. Figure 8.12 plots
gain, so substrate loss is taken into account; it is clear that the array works very
poorly at $\theta_s = 45^\circ$. In reality, the situation is even worse, since the gain $G$ does not
take into account the mismatch loss $(1 - |\Gamma|^2)$ which the antenna presents to the
source. The literature on antennas does not seem to adopt a consistent approach to
incorporating this effect; some authors [2] incorporate it into antenna efficiency.
The product $G(1 - |\Gamma|^2)$ is also sometimes referred to as realized gain. At the
blind scan angle, $|\Gamma| \approx 1$ so the product $G(1 - |\Gamma|^2)$ is almost zero. At $\theta_s = 50^\circ$,
the pattern has improved again, although the peak gain is not quite as large as at
40°. The reason for this is no doubt that the magnitudes and phases of the element
currents on the outside of the array differ significantly from those of the central
elements due to the different active impedances, and this effect becomes more
pronounced as the scan angle increases off broadside.

Figure 8.12 E-plane radiation patterns for scan angles of 40°, 45° and 50°, showing scan
blindness at 45°.
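
The effect of the mismatch factor on the gain is easily quantified. The sketch below applies the factor $(1 - |\Gamma|^2)$ to a gain expressed in dB; the gain and reflection coefficient values are hypothetical and serve only to show how quickly the realized gain collapses as $|\Gamma|$ approaches unity.

```python
import math


def mismatch_factor(gamma_mag):
    """Fraction of the available power accepted by the antenna, 1 - |Gamma|^2."""
    return 1.0 - gamma_mag ** 2


def realized_gain_db(gain_db, gamma_mag):
    """Gain in dB reduced by the mismatch factor; -inf if nothing is accepted."""
    factor = mismatch_factor(gamma_mag)
    if factor <= 0.0:
        return float("-inf")
    return gain_db + 10.0 * math.log10(factor)


# Hypothetical values: a 20 dB gain and three reflection coefficient magnitudes.
for gamma in (0.1, 0.5, 0.95):
    print(f"|Gamma| = {gamma:.2f}: realized gain = {realized_gain_db(20.0, gamma):.2f} dB")
```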


Modelling hints — array feeds 


When modelling an array, the feeds need to be imposed. Most MoM codes permit
a number of sources to be used. Most sources are essentially impressed volt- 
age feeds: a voltage is specified at each feed point — or port in network theory. 
The code then computes the resulting current, and from this, the impedance at 
the port. Multiple feeds simply augment the right-hand side (or forcing) vector 
$\{V\}$ of the generalized MoM impedance matrix equation $[Z]\{I\} = \{V\}$. Note that active
impedance calculations, unlike mutual power coupling ones, are not affected by
the terminating impedance(s) at the other ports. 

If some type of loop structure is used to apply feeds — for instance, for different 
scan angles — it is very important to ensure that all the previous feeds are zeroed. 
How this is done varies from code to code; in FEKO, one tags the first source as 
a new source and the code zeros all previous ones. 


Modelling hints — a useful equivalence for strip dipoles 


A strip dipole is often very thin in comparison to its length, and currents are thus 
essentially constrained to flow along its length in much the same way as with 
a wire dipole. In this case, it is possible to model a strip as a thin dipole. For a 
strip width $W$, the equivalent wire radius $a$ to use is $a = W/4$ [3]. In the case just
discussed, we chose to model the strip with surface patches, but using equivalent 
dipoles would be an option worth pursuing. 


8.5 A concluding discussion of stratified media formulations 


The printed dipole array example concludes this chapter on the practical application
of Sommerfeld potentials. Before leaving this topic, it is worth briefly mentioning
the issue of memory requirements and run-times. For the former, there
is little overhead when using the Sommerfeld formulation, since the memory requirement
is still dominated by the matrix storage, which remains $N^2$, with $N$
the number of unknowns. For the 256 element array, the FEKO simulation with
a mesh size $h/\lambda_0 = 1/25$ had 4864 basis functions, but the use of symmetry resulted
in only 2432 unknowns. This required around 183 MB of RAM to store. The statistics for
run-time are interesting: this is a moderately large problem in MoM terms, and in
a free-space environment one would expect the matrix solution time to start dominating
the run-time. In this case, however, using the Sommerfeld potentials, the
time required to compute the impedance matrix elements exceeded the time required
to solve the linear system by a factor of around fifteen.³ By comparison, for
a free-space problem with the same number of unknowns (and the same memory
requirement), the ratio was around two and a half. Note that we are comparing run-times
for problems with the same number of unknowns, to get an idea of the cost of
the Sommerfeld potentials compared to the free-space Green function. This is not
equivalent to running the simulation with a grounded substrate with $\epsilon_r = 1$, i.e.
vacuum. In this case, one needs to image the patches and the feeds in the ground


3 Actual “wall clock” run-times are so dependent on computer technology that, in common with much of this 
book, we prefer to use ratios where possible.

plane, using symmetry, so the equivalent problem will have more unknowns.
Hence for the same physical problem of a grounded “dielectric” (actually vacuum) 
slab, the Sommerfeld formulation is actually little more costly.⁴ Of course, this
is not relevant in practice, since most substrates have dielectric constants signifi- 
cantly larger than unity, and there is no alternative but the Sommerfeld approach. 

Summarizing the development in this and the preceding chapter, stratified media 
MoM formulations are theoretically complex, challenging to implement but poten- 
tially very efficient. This is largely due to only the metallic regions of the antenna 
(wire, patch, feed network etc.) being discretized — hence quite large microstrip an- 
tennas can be modelled. In the context of RF and microwave engineering, the most 
important contemporary application of this theory is to printed antenna technology, 
of which microstrip is the most commercially important type, and our examples 
have concentrated on this technology. Historically, terrestrial broadcasting,
especially LF, MF and HF, was another important application (indeed, this prompted
Sommerfeld’s original work), but with the exception of some specialized military
systems, this is hardly a dominant technology at present. Subsurface imaging
is another significant contemporary application; however, real grounds are not al- 
ways well stratified, and even if so, the stratifications may not be parallel with the 
ground—air interface. 

In concluding this chapter, some final points should be noted. Firstly, the 
Sommerfeld-MoM assumes an infinitely large substrate on a similarly infinite 
ground plane. Hence, such MoM programs do not provide any information 
about the effects of finite substrates/grounds. Also, many programs based on the 
Sommerfeld potentials are not truly general purpose. There are theoretical reasons 
for this: the near-fields are typically obtained via interpolation tables, the far-fields 
via asymptotic integrals, which may neglect some terms. Using such a program, 
especially for fields very close to interfaces, may result in anomalies; see for exam- 
ple [9]. However, for the purpose most commercial codes are designed for, usually 
microstrip and printed structures, the codes are generally robust and accurate.

9


An introduction to the finite element method 


9.1 Introduction 


The finite element method (FEM) is one of the best-known methods for the so- 
lution of partial differential equations in applied mathematics and computational 
mechanics. It is a method for solving a differential equation subject to certain 
boundary values, and in its modern form originated in the field of structural me- 
chanics during the late 1950s; the first specific usage of the term “element” is 
due to no lesser a person than Courant. In common with the MoM, its histori- 
cal antecedents are far older than this, in this case dating back to the nineteenth 
century and the variational methods first described by Lord Rayleigh. It is very 
widely and routinely used in structural mechanics today, as well as in compu- 
tational fluid dynamics, computational thermodynamics, the numerical solution 
of Schrödinger’s equation, field problems in general, and of course, in electromagnetics.


An historical aside — Courant and the finite element method 


The finite element method as presently accepted can be credited to Courant — 
whom we have already encountered in the context of the Courant limit for the 
FDTD method. The published version of his 1942 address to the American 
Mathematical Society contained an appendix added after the talk, to show by 
example how variational methods could be put to wider use in potential theory. 
He used piecewise linear approximations, on a set of triangles which he called 
“elements” — and thus the method was born [1, p. 5]. 


With the background we have now acquired with the FDTD and MoM, read- 
ers will recognize many features in common with both of these methods in the 
treatment to follow; indeed, they will probably not be surprised to learn that both
can be formulated within an FEM setting. In common with the MoM, the core 
idea is to replace some unknown function on a domain by an ensemble of ele- 
ments, with known shape but unknown amplitude. Unlike the basic FDTD, where 
the approximation of the E and H fields is always done on a rectangular, stag- 
gered grid, the FEM permits very general geometrical elements to be used and 
(usually) only uses one grid. The most widely used elements are known as sim- 
plicial — this simply means line elements in 1D, triangular in 2D and tetrahedral 
in 3D. Nonetheless, rectangular, prismatic and even curvilinear elements also find 
widespread application. Since the improved geometrical modelling made possible 
especially by triangular or tetrahedral meshes is one of the major features distin- 
guishing the FEM from the FDTD, our study of the FEM will be restricted to these 
elements. Interested readers may find treatments of other element shapes in the 
references. 

Similar to the FDTD, but unlike the MoM, the FEM is based on a local descrip- 
tion of the field quantities, derived from the differential equation description of the 
Maxwell equations, and does not automatically incorporate the Sommerfeld radiation
condition.¹ In practice, this means some form of mesh termination scheme
is required. The easiest is usually an absorbing boundary condition of some type. 
(However, it is also possible to use an “exact” termination scheme using the MoM 
on the boundary. This is covered in Chapter 10.) In common with the FDTD, and 
due to the differential equation basis of the two methods, the FEM permits very 
straightforward treatment of material discontinuities. 

The FEM was first applied in electromagnetics during the late 1960s, at much 
the same time as the initial work using the MoM and FDTD. The two earliest ap- 
plications were independently published by Silvester, and by Arlett, Bahrani and 
Zienkiewicz. Some of the history of the FEM in electromagnetics may be found 
in [1, p. 5]. However, this promising start was arrested during the 1970s and early 
1980s because of a problem called “spurious modes” which, combined with sub- 
stantial computational cost and complex coding, held back widespread adoption 
of the FEM in electromagnetics. Fortunately, there was a major theoretical breakthrough
with edge elements in the 1980s, which led to a far greater understanding
of the spurious mode problem, and the introduction of largely effective solutions. 
This improved theoretical understanding, combined with the widespread availabil- 
ity of very powerful computers, and increasing interest in wave interaction with 
non-metallic structures, has made the FEM a major analysis tool of contemporary 
CEM. 


1 This should not be confused with the Sommerfeld potentials for stratified media.
9.2 Variational and Galerkin weighted residual formulations: the Laplace equation


9.2.1 The weighted residual approach 


The FEM can be derived via two different, but equivalent, procedures. On the one 
hand, there is the Galerkin weighted residual formulation, already encountered in 
Chapter 4.² On the other hand, there is the variational approach (or more fully, the
variational boundary value problem). The latter is used by most textbooks. The 
former is more direct at the formulation level, but incorporating the boundary con- 
ditions is somewhat less obvious. We will discuss both approaches in this chapter. 

As usual, we will first illustrate the ideas with a simple example. Consider the 
following partial differential equation (PDE) in two dimensions: 


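$$\nabla \cdot \epsilon \nabla \phi = 0 \tag{9.1}$$

For linear, isotropic media, we have $\epsilon = \epsilon_r \epsilon_0$ and this is equivalently

$$\nabla \cdot \epsilon_r \nabla \phi = 0 \tag{9.2}$$

In a materially homogeneous region, this reduces to the Laplace equation:

$$\nabla^2 \phi = 0 \tag{9.3}$$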


With a PDE, boundary conditions must of course be specified. For a second- 
order PDE such as this, the following on the closure (boundary) are necessary and 
sufficient for a unique solution. 


• A value of the function $\phi$ is specified: this is a Dirichlet boundary condition. If $\phi = 0$, this
is called a homogeneous Dirichlet boundary condition.

• A value of the normal derivative, $\partial\phi/\partial n$, is specified: this is a Neumann boundary condition.
Again, if $\partial\phi/\partial n = 0$, this is called a homogeneous Neumann boundary condition.

• A linear combination of the above is specified: $\partial\phi/\partial n + \gamma\phi = q$. This is known as
a mixed boundary condition (also sometimes as a Cauchy boundary condition); the
Neumann boundary condition is a special case of this with $\gamma = 0$.


Note that these may be mixed in any ratio along the boundary: the boundary may be
entirely Dirichlet, or entirely Neumann,⁴ or entirely mixed, or some combination
of these along different sections of the boundary. However, they must be disjoint;
that is, more than one may not be simultaneously specified along the same part of
the boundary.


2 Readers who are not working through this book sequentially might wish to read Chapter 4 at this stage. 
3 Note that for higher-order PDEs, additional boundary conditions are required. 
4 In which case, the PDE can be solved only to within an unknown constant.
In terms of our previous work on the method of weighted residuals in Chapter 4,
the linear operator for the Laplace equation is $\mathcal{L} = \nabla^2$; the unknown function
is $f = \phi$ and the forcing function is $g = 0$. (Again, in this case, the mathematical term
homogeneous is sometimes used, this time in the context of the PDE.) We proceed
as with the MoM, by introducing basis functions $h_n$, weighting functions $W$ and an
inner product. The unknown function (the potential $\phi$ in this case) is expanded as


$$\phi \approx \sum_{n=1}^{N} a_n h_n \tag{9.4}$$


Suitable weighting functions $w_m$ are introduced:

$$W = \sum_{m=1}^{M} w_m \tag{9.5}$$


and an inner product is defined for this two-dimensional problem as 


$$\langle a, b \rangle = \iint_S ab \, dS \tag{9.6}$$


Hence, as before, a linear system is obtained, with entries of the following form 
for the m, nth system matrix element: 


$$\langle w_m, \mathcal{L} a_n h_n \rangle \tag{9.7}$$


At this stage, this looks so similar to the MoM that one might wonder why the 
FEM is regarded as a different method. (Indeed, a number of workers in the 1980s 
tried to unify the methods thus.) Although in general terms there are indeed simi- 
larities at this very fundamental functional analysis level, in practice there are great 
differences which lead to different algorithms being required. The most important 
is that the operator $\mathcal{L}$ is now a differential as opposed to an integral operator; this
means that only elements in close geometrical proximity have non-zero system 
matrix entries, and hence a very large number of the matrix entries are zero. Math- 
ematically, this is a sparse matrix; the MoM with integral equations generates full 
matrices. Another important difference is that with the MoM integral equation for- 
mulation, the boundary conditions are built into the formulation; with the FEM, 
these must be explicitly imposed (and we have not discussed at all how to do this). 

In short, the devil is in the detail, which we must now address, and to do this 
it is convenient to use the variational approach, the topic of the next section. But 
first, some finite element terminology will be introduced. With finite elements, 
we usually employ basis functions which span only a small part of the domain — 
subsectional as opposed to entire domain, in MoM parlance. This region is gen- 
erally known as the element, and the basis function is also frequently called the 
shape function. The term elemental function is also sometimes encountered. (It is
frequently thought that the term “finite element” comes from this geometrical decomposition
into finite regions, as opposed to infinite elements, which are also
sometimes used, but it has also been attributed to the finite energy in an element.)
As with the MoM, a variety of shape functions have been used. Generally, the 
most useful are polynomial interpolation functions — although shortly we will see 
another type of incomplete polynomial function, which is not interpolatory, but is 
very widely used, namely the edge-based element. 


9.2.2 The variational approach 


The equivalent variational functional 


At this stage, we are going to look at the finite element method from a different 
viewpoint, namely that of the variational functional approach. Instead of directly 
solving Eq. (9.1), we are going to work with an equivalent problem, namely an 
energy functional, whose minimum corresponds to the solution of the PDE. For 
Eq. (9.1), a suitable functional is: 


$$W(\phi) = \frac{1}{2} \iint \epsilon (\nabla\phi)^2 \, dS \tag{9.8}$$


We state this without proof for the present; subsequently we will return to this,
since the proof yields important information about the boundary conditions. We
note that this is the energy $\frac{1}{2}\iint \vec{D} \cdot \vec{E} \, dS$. We also note that the function $\phi$ in the
original equation had to be at least twice differentiable; in the above, it need only be
once differentiable. Due to this “weakening” of the requirements on the function,
this is sometimes called the weak formulation. For a linear, isotropic medium, we
have $\epsilon = \epsilon_r \epsilon_0$ and since we are eventually going to set the derivative of $W$ to zero,
we can just as well divide out by $\epsilon_0$ at this stage, leaving only the $\epsilon_r$ term:


$$W(\phi) = \frac{1}{2} \iint \epsilon_r (\nabla\phi)^2 \, dS \tag{9.9}$$


The shape functions 


In one dimension, the only choice to make is the shape of the basis function, but 
in two and three dimensions, we can choose both the shape function and the ge- 
ometrical shape of the element. The most popular choices in two dimensions are 
triangular and quadrilateral elements; for reasons already discussed, we will fo- 
cus on triangular elements, although we will use rectangular elements to introduce 
some ideas regarding vector elements. Assuming that the geometrical region (the
domain) has been decomposed into elements (later, we will discuss ways of doing
this), we note that Eq. (9.8) is valid within each element, and in the following we
will initially focus on this energy functional on an element-by-element basis.

Zero-order elements (the equivalent of the pulse basis functions we used for the 
first MoM example back in Chapter 4) cannot be differentiated even once, so are
not admissible in this problem. Hence, we will start with first-order elements. In
this case, the approximating function can be written as 


$$\phi \approx a + bx + cy \tag{9.10}$$


The constants a, b and c are, of course, what we require the FEM eventually to 
compute for us. However, it is more convenient to write this in a form where the 
unknowns are the potentials at the three triangle nodes, or in other words: 


$$\phi \approx \alpha_1(x, y)\phi_1 + \alpha_2(x, y)\phi_2 + \alpha_3(x, y)\phi_3 \tag{9.11}$$

This assumes the existence of suitable functions $\alpha_1(x, y)$, $\alpha_2(x, y)$ and $\alpha_3(x, y)$;
their properties will emerge shortly.
Noting $\phi_1 = a + bx_1 + cy_1$, and similarly for the other two nodes, we have


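$$\begin{bmatrix} \phi_1 \\ \phi_2 \\ \phi_3 \end{bmatrix} = \begin{bmatrix} 1 & x_1 & y_1 \\ 1 & x_2 & y_2 \\ 1 & x_3 & y_3 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} \tag{9.12}$$

Inverting the nodal coordinate matrix, we find:

$$\begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} 1 & x_1 & y_1 \\ 1 & x_2 & y_2 \\ 1 & x_3 & y_3 \end{bmatrix}^{-1} \begin{bmatrix} \phi_1 \\ \phi_2 \\ \phi_3 \end{bmatrix} \tag{9.13}$$

Now we have:

$$\phi = \begin{bmatrix} 1 & x & y \end{bmatrix} \begin{bmatrix} 1 & x_1 & y_1 \\ 1 & x_2 & y_2 \\ 1 & x_3 & y_3 \end{bmatrix}^{-1} \begin{bmatrix} \phi_1 \\ \phi_2 \\ \phi_3 \end{bmatrix} \tag{9.14}$$

which may be rewritten as

$$\phi = \sum_{i=1}^{3} \phi_i \alpha_i(x, y) \tag{9.15}$$

with

$$\alpha_1 = \frac{1}{2A}\left[(x_2 y_3 - x_3 y_2) + (y_2 - y_3)x + (x_3 - x_2)y\right] \tag{9.16}$$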
and $A$ the triangle area (which is conveniently half the determinant of the nodal
coordinate matrix).⁵ The other functions $\alpha_2$ and $\alpha_3$ are obtained by cyclic interchange
of the indices, modulo three.

Note that the functions $\alpha_i$ are interpolatory on the three vertices (nodes): i.e.
unity at node i, and zero at the other nodes. (Once again, we caution that not all 
the finite elements we will study have this property.) 
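
As a quick numerical check of Eq. (9.16) and of the interpolation property, the following minimal Python sketch evaluates the three first-order shape functions on a single triangle. The function name `shape_functions` and the vertex-list layout are illustrative assumptions made for this sketch; they are not taken from the book's MATLAB scripts.

```python
def shape_functions(tri, x, y):
    """Evaluate (alpha_1, alpha_2, alpha_3) of Eq. (9.16) at the point (x, y).

    tri is a list of three (x, y) vertex tuples in local order 1, 2, 3
    (a hypothetical layout chosen for this sketch).
    """
    (x1, y1), (x2, y2), (x3, y3) = tri
    # Determinant of the nodal coordinate matrix, i.e. twice the signed area.
    two_area = (x2 * y3 - x3 * y2) - (x1 * y3 - x3 * y1) + (x1 * y2 - x2 * y1)
    alphas = []
    for i in range(3):
        # Cyclic interchange of the indices, modulo three.
        xj, yj = tri[(i + 1) % 3]
        xk, yk = tri[(i + 2) % 3]
        alphas.append(((xj * yk - xk * yj) + (yj - yk) * x + (xk - xj) * y) / two_area)
    return tuple(alphas)


triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
at_node_1 = shape_functions(triangle, 0.0, 0.0)
at_node_2 = shape_functions(triangle, 1.0, 0.0)
at_centroid = shape_functions(triangle, 1.0 / 3.0, 1.0 / 3.0)
```

Each function should evaluate to unity at its own node and zero at the other two, and the three values should sum to one anywhere in the element.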


Manipulating the energy term 


Substituting Eq. (9.15) into Eq. (9.8), the following is obtained for the energy in 
an element e: 


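$$W^{(e)} = \frac{1}{2}\iint_{S_e} \epsilon \nabla\phi \cdot \nabla\phi \, dS = \frac{1}{2}\epsilon \sum_{i=1}^{3}\sum_{j=1}^{3} \phi_i \left[\iint_{S_e} \nabla\alpha_i \cdot \nabla\alpha_j \, dS\right] \phi_j \tag{9.17}$$

where we have now assumed that the permittivity is constant within element $e$.
This is very compactly written in matrix notation as:

$$W^{(e)} = \frac{1}{2}\{\phi\}^T \epsilon [S^{(e)}] \{\phi\} \tag{9.18}$$

with $\{\phi\}$ the vector of nodal potentials and

$$S^{(e)}_{ij} = \iint_{S_e} \nabla\alpha_i \cdot \nabla\alpha_j \, dS \tag{9.19}$$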
Se 


This matrix is often called the stiffness matrix, from the structural mechanics origin 
of the method, but this has no physical meaning in electromagnetics and we will 
not use this term frequently. (In this chapter, we will use [S] for this matrix, and
[T] for another frequently encountered matrix. This notation is due to Silvester
and Ferrari [2]. Unfortunately, there is no standard notation in this regard in the 
literature. Peterson, for instance, uses [E] and [F] respectively [3], as does Jin 
[4].) The expressions are simple to evaluate, for example, 


$$S^{(e)}_{12} = \frac{1}{4A}\left[(y_2 - y_3)(y_3 - y_1) + (x_3 - x_2)(x_1 - x_3)\right] \tag{9.20}$$
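
The closed-form entries follow from the fact that the gradients of the first-order shape functions are constant over the triangle. A minimal Python sketch of the full 3×3 element matrix is given below; the function name `element_s_matrix` is an assumption for illustration, and the absolute value of the area is used so that the result does not depend on whether the nodes are numbered clockwise or anticlockwise.

```python
def element_s_matrix(tri):
    """Return the 3x3 matrix S^(e) of Eq. (9.19) for a first-order triangle.

    tri is a list of three (x, y) vertex tuples in local order 1, 2, 3.
    """
    (x1, y1), (x2, y2), (x3, y3) = tri
    # grad(alpha_i) = (b[i], c[i]) / (2A), constant over the element.
    b = [y2 - y3, y3 - y1, y1 - y2]
    c = [x3 - x2, x1 - x3, x2 - x1]
    signed_area = 0.5 * (b[0] * c[1] - b[1] * c[0])
    # Integrating a constant over the element multiplies it by |A|.
    scale = 1.0 / (4.0 * abs(signed_area))
    return [[scale * (b[i] * b[j] + c[i] * c[j]) for j in range(3)] for i in range(3)]


s_e = element_s_matrix([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)])
```

The entry `s_e[0][1]` corresponds to Eq. (9.20). Each row sums to zero, reflecting the fact that a constant potential stores no energy.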


Connecting the elements 


At this stage, we have worked in isolation, considering the element on its own. 
Each element has nodes numbered locally from one to three. In practice of course,
there will be a (perhaps very large) number of elements, with nodes numbered 


5 Note here that A is a signed quantity, whose sign depends on whether the nodes are numbered clockwise or 
anticlockwise. See Section 9.7.2 for further discussion. 
Figure 9.1 Two triangular elements, disconnected (left) and connected (right).


according to some global numbering scheme. (It is worth commenting that map- 
ping local to global information, and vice versa, usually requires a significant 
amount of book-keeping in the average FEM code.) We need some method to con- 
nect the elements; various approaches are available. At present, we will assume 
the existence of a connection matrix which tells us how to map the unconnected 
nodes to the connected mesh. As a simple example, see Fig. 9.1, which shows two 
such triangles. The connection matrix for this system is 


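the $6 \times 4$ matrix $[C]$ of zeros and ones, in which row $i$ holds a single unit entry in the column of the connected node that disconnected node $i$ maps to, (9.21)

and thus

$$\{\phi_{dis}\} = [C]\{\phi_{con}\} \tag{9.22}$$

with

$$\{\phi_{dis}\} = \{\phi_1 \; \phi_2 \; \phi_3 \; \phi_4 \; \phi_5 \; \phi_6\}_{dis}, \qquad \{\phi_{con}\} = \{\phi_1 \; \phi_2 \; \phi_3 \; \phi_4\}_{con} \tag{9.23}$$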


Although this may be belaboring the obvious, this connection matrix ensures that 
the potential at each node is the same on all elements sharing that node. (This 
seems simple and obvious, but we will see that in the context of vector fields, this 
may not always be desirable.) 

Using this, the resulting equation for the energy in the whole system is: 


$$W = \frac{1}{2}\{\phi_{con}\}^T [S] \{\phi_{con}\}, \qquad [S] = [C]^T [\epsilon_r S_{dis}] [C]$$

Here $[S_{dis}]$ is the block-diagonal matrix holding the element matrices of the disconnected elements.
The term $[\epsilon_r S_{dis}]$ requires that the stiffness matrix of each element be multiplied by
the relative permittivity associated with the element before one proceeds further;
it is included in the overall [S] matrix for the problem.
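
The connection-matrix product can be sketched in a few lines of Python. The geometry (a unit square cut along one diagonal) and the mapping of disconnected to connected node numbers below are assumptions chosen for illustration and need not match the numbering of Fig. 9.1; the helper names are likewise hypothetical.

```python
def element_s_matrix(tri):
    """3x3 element matrix of Eq. (9.19) for a first-order triangle."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    b = [y2 - y3, y3 - y1, y1 - y2]
    c = [x3 - x2, x1 - x3, x2 - x1]
    area = 0.5 * abs(b[0] * c[1] - b[1] * c[0])
    return [[(b[i] * b[j] + c[i] * c[j]) / (4.0 * area) for j in range(3)] for i in range(3)]


def matmul(a, b):
    """Dense matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


# Assumed connected mesh: four nodes on the unit square (0-based indices).
con_xy = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
# Assumed map from the six disconnected nodes to the four connected nodes.
dis_to_con = [0, 1, 2, 1, 3, 2]
eps_r = [1.0, 1.0]  # relative permittivity of each element

# Connection matrix [C]: a single unit entry in each of its six rows.
C = [[1.0 if dis_to_con[i] == j else 0.0 for j in range(4)] for i in range(6)]

# Block-diagonal [eps_r S_dis], one 3x3 block per disconnected element.
S_dis = [[0.0] * 6 for _ in range(6)]
for e in range(2):
    tri = [con_xy[dis_to_con[3 * e + k]] for k in range(3)]
    s_e = element_s_matrix(tri)
    for i in range(3):
        for j in range(3):
            S_dis[3 * e + i][3 * e + j] = eps_r[e] * s_e[i][j]

S = matmul(transpose(C), matmul(S_dis, C))
```

The resulting 4×4 matrix is symmetric, and the two nodes on the shared edge receive contributions from both elements.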

However, the formulation is not completed yet. It will be recalled that it is the 
solution which minimizes the variational functional which corresponds to the so- 
lution of the PDE, and all we have here is an expression for the energy in the 
connected elements. We must now establish this minimum. In doing this, we need 
to distinguish formally between free (f) and prescribed (p) potentials here. The 
latter are those prescribed by the Dirichlet boundary conditions. The former are the 
degrees of freedom (the unknowns) in the problem. It is convenient if we choose to 
number first the free and then the prescribed potentials; this can be done relatively 
easily, even for moderately complex geometries. 

Differentiating with respect to the free potentials, and setting the resultant ex- 
pression to zero, one obtains 


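$$\frac{\partial W}{\partial\{\phi_f\}} = \frac{\partial}{\partial\{\phi_f\}} \left( \frac{1}{2} \begin{bmatrix} \phi_f^T & \phi_p^T \end{bmatrix} \begin{bmatrix} S_{ff} & S_{fp} \\ S_{pf} & S_{pp} \end{bmatrix} \begin{bmatrix} \phi_f \\ \phi_p \end{bmatrix} \right) = \{0\} \tag{9.24}$$

Expanding the quadratic, differentiating, and then using $[S_{fp}]^T = [S_{pf}]$, yields:

$$\begin{bmatrix} S_{ff} & S_{fp} \end{bmatrix} \begin{bmatrix} \phi_f \\ \phi_p \end{bmatrix} = 0 \tag{9.25}$$

or, more conveniently,

$$[S_{ff}]\{\phi_f\} = -[S_{fp}]\{\phi_p\} \tag{9.26}$$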


Once again, this is a system of linear equations which can be solved using standard
techniques. Here we should note that the matrices $[S_{ff}]$ and $[S_{fp}]$ are sparse,
containing only entries where nodes are shared by elements; for initial implementation
work we need not exploit this, but FEM codes for practical applications
must, or much of the benefit of the FEM is lost. Note also that these terms include
$\epsilon_r$ in the S matrix elements as above.
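
For readers who wish to see the intermediate step between the partitioned energy expression and the final linear system, the expansion can be written out as follows. This is a sketch that assumes only the symmetry of the global matrix, so that $[S_{ff}]$ is symmetric and $[S_{pf}]^T = [S_{fp}]$.

```latex
\begin{align*}
W &= \tfrac{1}{2}\Big( \{\phi_f\}^T [S_{ff}] \{\phi_f\}
      + \{\phi_f\}^T [S_{fp}] \{\phi_p\}
      + \{\phi_p\}^T [S_{pf}] \{\phi_f\}
      + \{\phi_p\}^T [S_{pp}] \{\phi_p\} \Big) \\
\frac{\partial W}{\partial \{\phi_f\}}
  &= [S_{ff}] \{\phi_f\}
     + \tfrac{1}{2}\big( [S_{fp}] + [S_{pf}]^T \big) \{\phi_p\} \\
  &= [S_{ff}] \{\phi_f\} + [S_{fp}] \{\phi_p\} = \{0\}
\end{align*}
```

The term in $[S_{pp}]$ does not depend on the free potentials and therefore drops out on differentiation.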


A mathematical aside — partial differentiation of matrices and vectors 


Since the free potentials are most conveniently written as a vector, it is useful to 
note that vectors can be differentiated much as scalars, viz. 
$$\frac{\partial (C\{x\})}{\partial\{x\}} = C$$

$$\frac{\partial (\{x\}^T [A] \{x\})}{\partial\{x\}} = 2[A]\{x\}$$

etc. In the above, C is a scalar constant and [A] a constant symmetric matrix. This result
greatly simplifies the analytical work required in minimizing the functional.
Such identities can be proven by expanding the vector expression into its com- 
ponents, and then differentiating with respect to each of them in turn. A good 
reference to read more on this topic is [5, Appendix B]. 
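
The quadratic-form identity is easy to confirm numerically. The following Python sketch compares the analytical gradient $2[A]\{x\}$ with a central finite-difference estimate; the matrix, the evaluation point and the step size are arbitrary choices made for illustration, and the identity as used here assumes a symmetric [A].

```python
def quad_form(a, x):
    """Return the scalar x^T A x for a matrix a (list of rows) and vector x."""
    n = len(x)
    return sum(x[i] * a[i][j] * x[j] for i in range(n) for j in range(n))


def numerical_gradient(f, x, step=1e-6):
    """Central-difference estimate of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += step
        x_minus[i] -= step
        grad.append((f(x_plus) - f(x_minus)) / (2.0 * step))
    return grad


A = [[2.0, 1.0], [1.0, 3.0]]  # symmetric by construction
x0 = [1.0, 2.0]

grad_fd = numerical_gradient(lambda v: quad_form(A, v), x0)
grad_exact = [2.0 * sum(A[i][j] * x0[j] for j in range(2)) for i in range(2)]
```

The two gradients should agree to within the rounding error of the finite-difference estimate.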


Coding hints - FEM data structures 


Note that in practice one rarely numbers all the nodes in an unconnected
fashion first; instead, node 4 on the right-hand triangle would probably be
referenced using some data structure of the form element(m)%nodeone,
with m the element number, in a language such as FORTRAN 90 which supports
derived data types, i.e. objects of a type defined by the user. The %
in FORTRAN 90 is a component selector, and returns the component called
nodeone from the mth entry in derived data type element. In MATLAB
(which does not support this type of derived data structure), one might have
a variable named element_nodeone(m), or use a two-dimensional array of
the form element_nodes(m, local_node_num); there are a variety of possibilities.

Furthermore, even if used, the connection matrix is also not stored as explained
here; the reason is that it is highly sparse and could be stored far more
efficiently in some type of compressed storage scheme.


9.2.3 Some practical issues: assembling the system 


In FEM parlance, the process of filling the finite element system matrix is fre- 
quently known as matrix assembly. For practical codes, it is generally convenient 
to loop over the elements rather than the nodes (recall that the degrees of freedom 
are the nodal potentials for this first-order scheme). This is known as assembly 
by elements. For a particular global degree of freedom i, any element which con- 
tains this node will contribute to the matrix. For triangles, this number depends on 
the mesh. We will now discuss two practical methods which simplify this matrix 
assembly process. 


Connecting the system 


The connection matrix is useful for explaining the method, but inconvenient. Prac- 
tical programs do this essentially by inspection. A global numbering system is 
adopted from the start. As each element’s [S] matrix is computed, it is entered 
into the global matrix. A formal method for doing this has been described in 
[2, pp. 51-53], but essentially one simply adds the contributions of each elemental
matrix at the appropriate global row and column entry. Once again, note that
sparsity has not yet been exploited. 
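
Assembly by elements, as just described, might look as follows in Python. This is a minimal dense-matrix sketch that deliberately ignores sparsity; the names `assemble`, `nodes` and `elements`, the element-to-node connectivity layout and the two-triangle test mesh are assumptions made for illustration.

```python
def element_s_matrix(tri):
    """3x3 element matrix of Eq. (9.19) for a first-order triangle."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    b = [y2 - y3, y3 - y1, y1 - y2]
    c = [x3 - x2, x1 - x3, x2 - x1]
    area = 0.5 * abs(b[0] * c[1] - b[1] * c[0])
    return [[(b[i] * b[j] + c[i] * c[j]) / (4.0 * area) for j in range(3)] for i in range(3)]


def assemble(nodes, elements, eps_r):
    """Add each element matrix into the global matrix at its global rows and columns."""
    n = len(nodes)
    s_global = [[0.0] * n for _ in range(n)]
    for e, conn in enumerate(elements):
        s_e = element_s_matrix([nodes[g] for g in conn])
        for i_local, i_global in enumerate(conn):
            for j_local, j_global in enumerate(conn):
                s_global[i_global][j_global] += eps_r[e] * s_e[i_local][j_local]
    return s_global


# Assumed test mesh: the unit square split into two triangles.
nodes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
elements = [(0, 1, 2), (1, 3, 2)]  # global node numbers of each element (0-based)
S_global = assemble(nodes, elements, [1.0, 1.0])
```

The `elements` array plays the role of the two-dimensional connectivity array mentioned in the coding hints, and no connection matrix is ever formed.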


Handling the boundary conditions 


Repeating the matrix equation to solve, 


$$[S_{ff}]\{\phi_f\} = -[S_{fp}]\{\phi_p\}$$

we see that the prescribed boundary potentials form the right-hand side of the matrix equation.
The easiest approach is to number free unknowns first, then prescribed unknowns,
as already briefly mentioned. Entries of the form $S_{ff}$ (i.e. both nodes free)
are entered into the system matrix; entries of the form $S_{fp}$ (i.e. one node free) are
multiplied by the prescribed potential and, with a change of sign, entered into the right-hand side vector.
Entries of the form $S_{pf}$ and $S_{pp}$ play no role. (Actually, $[S_{pf}]^T = [S_{fp}]$ and this
is implicitly included during the minimization process, when this is exploited.)
Another method has been described in the literature [2, pp. 49-50] which is 
useful when it is not possible, or very inconvenient, to number first free then pre- 
scribed elements; it uses dummy entries, and increases the matrix size slightly. 
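
To make the free-then-prescribed numbering concrete, the following Python sketch solves the Laplace equation on a deliberately tiny mesh: a unit square with one free node at its centre and four prescribed corner nodes. The mesh, the boundary potentials and all function names are assumptions chosen for illustration; a practical code would use sparse storage and a sparse solver.

```python
def element_s_matrix(tri):
    """3x3 element matrix of Eq. (9.19) for a first-order triangle."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    b = [y2 - y3, y3 - y1, y1 - y2]
    c = [x3 - x2, x1 - x3, x2 - x1]
    area = 0.5 * abs(b[0] * c[1] - b[1] * c[0])
    return [[(b[i] * b[j] + c[i] * c[j]) / (4.0 * area) for j in range(3)] for i in range(3)]


def solve(a, rhs):
    """Gaussian elimination with partial pivoting, for small dense systems."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


# Free node first (index 0, the centre), then the four prescribed corners.
nodes = [(0.5, 0.5), (0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
elements = [(0, 1, 2), (0, 2, 3), (0, 3, 4), (0, 4, 1)]
n_free = 1
phi_p = [0.0, 0.0, 1.0, 1.0]  # bottom corners at 0, top corners at 1

# Assembly by elements (relative permittivity of one everywhere).
n = len(nodes)
S = [[0.0] * n for _ in range(n)]
for conn in elements:
    s_e = element_s_matrix([nodes[g] for g in conn])
    for i_local, i_global in enumerate(conn):
        for j_local, j_global in enumerate(conn):
            S[i_global][j_global] += s_e[i_local][j_local]

# Partition: S_ff goes into the system matrix, -S_fp * phi_p into the right-hand side.
S_ff = [row[:n_free] for row in S[:n_free]]
rhs = [-sum(S[i][n_free + j] * phi_p[j] for j in range(len(phi_p))) for i in range(n_free)]
phi_f = solve(S_ff, rhs)
```

By symmetry the centre potential should lie midway between the two prescribed values, which gives a simple sanity check on both the assembly and the sign of the right-hand side.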


9.2.4 More on variational functionals 


Earlier, we mentioned that the equivalence between the PDE and the variational 
functional lies at the heart of the variational FEM approach. Having now seen a 
basic FEM formulation developed, we need to return to the theoretical underpin- 
nings of the method. We will work with the more general Poisson PDE, which 
includes a source term, for a homogeneous region: 


$$\nabla^2\phi = -\frac{\rho}{\epsilon} \tag{9.27}$$

where $\rho$ is the source, and the boundary conditions on $S = S_1 + S_2$, as before, are
Dirichlet on $S_1$ and Neumann on $S_2$. There does not appear to be a systematic process
to construct variational functionals from PDEs (the reverse process is called
Euler’s method), and usually one instead shows that the proposed variational func- 
tional has the required properties. Thus we propose that the following variational 
functional has an extremal point, which corresponds to the solution of the Poisson 
equation above, with the required boundary conditions: 


$$W(\phi) = \frac{1}{2}\iint \nabla\phi \cdot \nabla\phi \, dS - \iint \frac{\rho}{\epsilon}\phi \, dS \tag{9.28}$$


and we will then show that it indeed has these properties. 
Proving the equivalence of the functional and PDE


We will now apply what is known as a variational analysis. We postulate the fol- 
lowing: 


$$\phi' = \phi + \theta h \tag{9.29}$$

where $\phi'$ is the trial solution; $\phi$ is the solution of the PDE; $h$ is some (differentiable)
function (which, rather importantly, must be zero at prescribed boundaries since
by definition $\phi$ is known there) and $\theta$ is a (real valued) perturbation parameter.
This is then substituted into the variational functional, Eq. (9.28):


$$W(\phi + \theta h) = W(\phi) + \theta \iint \nabla\phi \cdot \nabla h \, dS - \theta \iint \frac{\rho}{\epsilon} h \, dS + \frac{\theta^2}{2} \iint \nabla h \cdot \nabla h \, dS$$

The first and last terms are always greater than or equal to zero (positive semi-definite
in mathematical terms). The terms in $\theta$ constitute the first variation; what we must
now show is that this is zero. To do this, we will use Green’s theorem, which is
essentially multi-dimensional integration by parts:


$$\iint_S u \nabla^2 v \, dS = \oint_C u (\nabla v) \cdot \hat{n} \, dC - \iint_S \nabla u \cdot \nabla v \, dS \tag{9.30}$$


Using this, one finds: 


$$\iint \nabla\phi \cdot \nabla h \, dS = \oint h \frac{\partial\phi}{\partial n} \, dC - \iint h \nabla^2\phi \, dS \tag{9.31}$$


Now, a subtle argument is introduced. The contour integral must be zero to eliminate
the first variation. Clearly, on $S_1$, $h = 0$ by definition, since the value of $\phi$ is
known. If $\partial\phi/\partial n = 0$ on $S_2$, then we have achieved our aim. This, of course, is just the
homogeneous Neumann boundary condition.

The other surface integral term yields 


$$-\iint h \nabla^2\phi \, dS = \iint \frac{\rho}{\epsilon} h \, dS \tag{9.32}$$


since $\phi$ is the solution of the PDE. This cancels with the other term in $\theta$. Hence,
the first variation is zero, subject to either Dirichlet boundary conditions on $S_1$ or
homogeneous Neumann boundary conditions on $S_2$, and we have shown what we
set out to achieve.

In the finite element procedure, we actually perform the operation in the inverse
order. Minimizing⁶ the energy functional, by differentiating with respect to the


6 In general, one should rather speak of rendering the functional stationary, or finding the extremal point, but for 
this problem the functional is indeed minimized. 
free potentials, is equivalent to forcing the first variation to zero; given prescribed
boundary conditions on $S_1$, we then naturally enforce homogeneous Neumann
boundary conditions on $S_2$. (It is worth noting that the latter boundary condition is
enforced in an average sense on $S_2$; that is, it is not exactly enforced at each point
on $S_2$.)


Summary of boundary conditions 


The issue of boundary conditions is so important with the FEM that it deserves 
to be highlighted. There are two types of boundary conditions in elementary FEM 
analysis. 


• Dirichlet boundary conditions: these are essential and must be explicitly set.

• Homogeneous Neumann boundary conditions: these are natural and are implicitly enforced.
A homogeneous Neumann boundary condition corresponds to a symmetry
plane; it is often used to reduce the computational domain.


The reason that it is so important to be aware of this is that even if one is only 
using an FEM code and has no intention of ever writing one, code developers as- 
sume that users know this. In particular, it is very important to realize that an unset 
boundary condition is not an error in the FEM process: it is a natural homogeneous 
Neumann boundary condition. 

As mentioned earlier, more complex boundary conditions may also be encoun- 
tered, including inhomogeneous Neumann boundary conditions and mixed bound- 
ary conditions. 


Boundary conditions at material interfaces 


One of the great strengths of the FEM is that handling inhomogeneous regions 
is very simple. There are, however, one or two subtleties worth highlighting. The 
boundary conditions on the electrostatic potential at the interface between regions 
1 and 2, with appropriate dielectric constants, are: 


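$$\phi_1 = \phi_2 \tag{9.33}$$

$$\epsilon_1 \frac{\partial\phi_1}{\partial n} = \epsilon_2 \frac{\partial\phi_2}{\partial n} \tag{9.34}$$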


The former comes from the requirement of tangential electric field continuity, the 
latter from normal electric flux continuity. 

With the connection matrix approach, we force potentials to be continuous at a
material interface, so that Eq. (9.33) is satisfied. It turns out that flux continuity,
Eq. (9.34), is a natural boundary condition of the
variational approach. This is an important point. To show this, one starts with


$$\nabla \cdot \epsilon \nabla\phi = -\rho \tag{9.35}$$

and the variational functional

$$W(\phi) = \frac{1}{2}\iint \epsilon \nabla\phi \cdot \nabla\phi \, dS - \iint \phi\rho \, dS \tag{9.36}$$


and proceeds with an analysis along exactly the same lines as before, but with
the domain split in two.⁷ Two additional terms then appear in the first variation,
representing the flux continuity condition at the interface. From the stationarity 
requirement, flux continuity follows (for details, refer to [4, Section 3.2]). 

Within a code, the above is usually entirely invisible to the user. 


9.2.5 The Poisson equation: incorporating a source term 


Including the term $-\frac{1}{\epsilon}\iint \phi\rho \, dS$ representing the source in the functional results in
a new matrix, [T]. (Again, for historical reasons, this is sometimes called the mass
matrix.) The (known) source term $\rho$ is discretized using the same interpolation
scheme as $\phi$, i.e. first-order triangular finite elements in this case, but with known
coefficients. The entries in [T] are computed from

$$T_{ij} = \iint_{S_e} \alpha_i \alpha_j \, dS \tag{9.37}$$


with a the nodal interpolation functions as before. 
The result is a matrix equation of the following form: 


$$[S_{ff}]\{\phi_f\} = \frac{1}{\epsilon}[T]\{\rho\} - [S_{fp}]\{\phi_p\} \tag{9.38}$$

It is interesting to note that the inhomogeneous part of the PDE ($\{\rho\}$) plays the
same role in the finite element system matrices as the inhomogeneous part of the
boundary conditions ($\{\phi_p\}$).
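
For first-order triangles, the integral in Eq. (9.37) has a well-known closed form: A/6 on the diagonal and A/12 off the diagonal, with A the element area. The Python sketch below computes the element [T] matrix both from this closed form and from the three-point edge-midpoint quadrature rule, which is exact for the quadratic integrand; the function names are assumptions made for illustration.

```python
def triangle_area(tri):
    """Unsigned area of a triangle given as three (x, y) vertex tuples."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    return 0.5 * abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))


def element_t_matrix(tri):
    """Closed-form element [T] matrix: A/6 on the diagonal, A/12 elsewhere."""
    area = triangle_area(tri)
    return [[area / 6.0 if i == j else area / 12.0 for j in range(3)] for i in range(3)]


def element_t_by_quadrature(tri):
    """Same matrix from the edge-midpoint rule: integral = (A/3) * sum over midpoints."""
    area = triangle_area(tri)
    # Values of (alpha_1, alpha_2, alpha_3) at the midpoints of edges 1-2, 2-3 and 3-1.
    midpoints = [(0.5, 0.5, 0.0), (0.0, 0.5, 0.5), (0.5, 0.0, 0.5)]
    return [[area / 3.0 * sum(a[i] * a[j] for a in midpoints) for j in range(3)]
            for i in range(3)]


tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
t_closed = element_t_matrix(tri)
t_quad = element_t_by_quadrature(tri)
```

Because the shape functions sum to one, all nine entries of the element matrix add up to the element area, which is a convenient check.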


9.2.6 Discussion 


This completes our introductory discussion of the method. An obvious extension 
for the Laplace (and Poisson) equations is to introduce higher-order elements, using
quadratic, cubic, quartic or even higher-order polynomials. This has been very comprehensively
addressed in [2], and for static problems works very well. However, for dynamic 
problems, our main interest, we need to introduce a different type of element, 
called variously the edge element, vector element or Whitney element, so we will 
not pursue scalar elements any further. However, before we can address vector 


7 The extension to an arbitrary number of different materials is obvious. 
elements, we need to introduce a concept widely used in FEM analysis, namely
simplex coordinates, the topic of the next section. 


9.3 Simplex coordinates 


Simplex coordinates — also known as homogeneous or barycentric (or in 2D, area) 
coordinates — provide an entirely local geometrical description within a triangle 
(in 2D) or tetrahedron (in 3D). This is very convenient, since it allows much 
of the work required to be done once (on what is often called the “parent” tri- 
angle) and then with some simple geometrical scaling, it can be applied to any 
triangle or tetrahedron. They are intimately linked to simplicial elements — the 
simplest possible geometrical shape in the space, that is line elements in one di- 
mension, triangles in two dimensions and tetrahedra in three dimensions. (The 
concept can be extended to higher dimensions, but loses any geometrical inter- 
pretation.) 

In general, simplex coordinates are defined as the ratios of lengths (1D), areas 
(2D) or volumes (3D) that a point in the interior (or on the boundary) splits the 
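line/triangle/tetrahedron into. The size $\sigma(S)$ of a simplex $S$ is defined as:

$$\sigma(S) = \frac{1}{N!} \begin{vmatrix} 1 & x_1^{(1)} & x_1^{(2)} & \cdots & x_1^{(N)} \\ 1 & x_2^{(1)} & x_2^{(2)} & \cdots & x_2^{(N)} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{N+1}^{(1)} & x_{N+1}^{(2)} & \cdots & x_{N+1}^{(N)} \end{vmatrix}$$

where superscripts denote space directions and subscripts denote vertices.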


9.3.1 Simplex coordinates in one, two and three dimensions 


In one dimension, we have 


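$$\lambda_1 = \frac{\sigma(S_1)}{\sigma(S)} = \frac{1}{L}\begin{vmatrix} 1 & x \\ 1 & x_2 \end{vmatrix} = \frac{x_2 - x}{L}$$

$$\lambda_2 = \frac{\sigma(S_2)}{\sigma(S)} = \frac{1}{L}\begin{vmatrix} 1 & x_1 \\ 1 & x \end{vmatrix} = \frac{x - x_1}{L}$$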

These express the ratios of length from the right and left nodes respectively to 
point x, to the total length of the element. These are frequently encountered in 
MoM analysis as local coordinates. 
In two dimensions, we have

$$\lambda_1 = \frac{\sigma(S_1)}{\sigma(S)} = \frac{1}{2A}\begin{vmatrix} 1 & x & y \\ 1 & x_2 & y_2 \\ 1 & x_3 & y_3 \end{vmatrix} = \frac{(x_2 y_3 - x_3 y_2) + (y_2 - y_3)x + (x_3 - x_2)y}{2A} \tag{9.40}$$

This represents the ratio of the area of the triangle P23 to 123 (see Fig. 9.3, in
Section 9.6.3). It will be noted that $\lambda_1 = \alpha_1$, the first-order interpolatory function
used in our earlier analysis, indicating how convenient the simplex coordinates are
for functions defined over a triangle. There are three simplex coordinates in 2D:
$\lambda_1$, $\lambda_2$ and $\lambda_3$, describing the three area ratios.
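
The area-ratio definition translates directly into code. The following Python sketch computes the three simplex coordinates of a point P as ratios of signed sub-triangle areas; the function names and the test triangle are assumptions made for illustration.

```python
def signed_area(p, q, r):
    """Signed area of triangle pqr (positive for anticlockwise ordering)."""
    return 0.5 * ((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1]))


def simplex_coordinates(tri, p):
    """Return (lambda_1, lambda_2, lambda_3) of point p in triangle 123."""
    v1, v2, v3 = tri
    total = signed_area(v1, v2, v3)
    return (
        signed_area(p, v2, v3) / total,  # area of P23 over area of 123
        signed_area(v1, p, v3) / total,  # area of 1P3 over area of 123
        signed_area(v1, v2, p) / total,  # area of 12P over area of 123
    )


tri = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
lam = simplex_coordinates(tri, (0.5, 0.5))
```

Using signed areas in both numerator and denominator means the result does not depend on whether the nodes are numbered clockwise or anticlockwise.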

In three dimensions, we have 


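$$\lambda_1 = \frac{\sigma(S_1)}{\sigma(S)} = \frac{1}{6V}\begin{vmatrix} 1 & x & y & z \\ 1 & x_2 & y_2 & z_2 \\ 1 & x_3 & y_3 & z_3 \\ 1 & x_4 & y_4 & z_4 \end{vmatrix}$$

This represents the ratio of the volume of the tetrahedron P234 to the volume of
the element.

There are four simplex coordinates in 3D: $\lambda_1$, $\lambda_2$, $\lambda_3$ and $\lambda_4$, describing the four
volumetric ratios.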


9.3.2 Some properties of simplex coordinates 


Aside from the interpretation as the ratio of sizes, simplex coordinates have other 
important properties. Some of these are as follows. 


• The coordinates are normalized, thus $\sum_{i=1}^{N+1} \lambda_i = 1$.

• In two and three dimensions, the gradient of each simplex coordinate is a constant, and normal to the relevant edge (2D) or face (3D). In 2D, for example:

$$\nabla\lambda_i = -\frac{l_i}{2A}\,\hat{n}_i \tag{9.41}$$

with $A$ the area of the triangle, $l_i$ the length of edge $i$, and $\hat{n}_i$ the (outward) unit normal to edge $i$. This property is extensively exploited in vector elements, of which more later.

• Because of the normalization, $0 \le \lambda_i \le 1\ \forall i$ for any point within the element. This can be a useful and quick test to see whether a point lies inside or outside an element.
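
The inside/outside test in the last bullet can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the book: the names `simplex_coords_2d` and `is_inside` and the tolerance `tol` are introduced here, and signed (rather than unsigned) areas are used so that a point outside the triangle produces at least one negative coordinate.

```python
def simplex_coords_2d(p, v1, v2, v3):
    """Simplex coordinates (l1, l2, l3) of point p in triangle v1-v2-v3.

    Each coordinate is the ratio of a signed sub-triangle area to the signed
    area of the whole triangle, following the pattern of Eq. (9.40).
    """
    def signed_double_area(a, b, c):
        # Equals the determinant |1 ax ay; 1 bx by; 1 cx cy| = 2 * signed area.
        return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])

    total = signed_double_area(v1, v2, v3)
    if total == 0.0:
        raise ValueError("degenerate triangle")
    l1 = signed_double_area(p, v2, v3) / total   # area(P23) / area(123)
    l2 = signed_double_area(v1, p, v3) / total   # area(1P3) / area(123)
    l3 = signed_double_area(v1, v2, p) / total   # area(12P) / area(123)
    return l1, l2, l3


def is_inside(p, v1, v2, v3, tol=1e-12):
    """True if every simplex coordinate of p lies in [0, 1] (within tol)."""
    return all(-tol <= lam <= 1.0 + tol
               for lam in simplex_coords_2d(p, v1, v2, v3))
```

For the right-angled triangle with nodes (0, 1), (0, 0) and (1, 0), the point (0.25, 0.25) gives coordinates that sum to one and lie in [0, 1], whereas the point (1, 1) gives a negative $\lambda_2$ and is rejected.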


9.4 The high-frequency variational functional 
For electrodynamic problems, subject to the deterministic vector wave equation,

$$\nabla\times\frac{1}{\mu_r}\nabla\times\vec{E} - k_0^2\epsilon_r\vec{E} = -jk_0Z_0\vec{J} \tag{9.42}$$

with $\vec{J}$ a source internal to domain $\Omega$ and $k_0$ the free-space wavenumber, the equivalent variational functional which must be rendered stationary is (see footnote 8):

$$F(\vec{E}) = \frac{1}{2}\int_\Omega\left[\frac{1}{\mu_r}\left|\nabla\times\vec{E}\right|^2 - k_0^2\epsilon_r\left|\vec{E}\right|^2\right]d\Omega + jk_0Z_0\int_\Omega\vec{E}\cdot\vec{J}\,d\Omega \tag{9.43}$$

This assumes either homogeneous Dirichlet or Neumann boundary conditions or a mixture of the two on the boundary of domain $\Omega$.
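A closely related functional for the source-free vector wave equation

$$\nabla\times\frac{1}{\mu_r}\nabla\times\vec{E} - k^2\epsilon_r\vec{E} = 0 \tag{9.44}$$

is the following:

$$F(\vec{E}) = \frac{1}{2}\int_\Omega\left[\frac{1}{\mu_r}\left|\nabla\times\vec{E}\right|^2 - k^2\epsilon_r\left|\vec{E}\right|^2\right]d\Omega \tag{9.45}$$

subject to the same boundary conditions. In this case, the solution is the set of eigenvalues $k_i$ and associated eigenvectors $\vec{E}_i$.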

In order to show the above properties, one proceeds in a fashion similar to the 
Poisson equation, using a vector Green’s theorem for the double-curl operator. The 
details are available in [2, 4] and although more complex than the Poisson case, 
the method is the same, so we will not repeat them here. 

This form (often called the curl-curl form) has been used for high-frequency 
FEM analysis for many years. However, although it appears fairly straightfor- 
ward to discretize, it turned out to have a number of problems which occupied 
analysts for some years. One of the most important advances was the introduc- 
tion of vector (edge) elements in the late 1980s, and this is the topic of the next 
section. 


8 This is actually the functional for lossless materials; see [4, Chapter 6] for further discussion of this.

9.5 Spurious modes


One of the supposed strengths of the FEM was its accuracy, in particular when 
compared to a method such as the FDTD, until serious problems with “spurious 
modes” were found using standard (node-based) FEM for electromagnetic eigen- 
value problems (we will define these later). The traditional, nodal FEM approach, 
typical of structural mechanics, deals with a vector field by approximating each 
component separately: 


$$E_x \approx \sum_i E_{xi}\,f_i(x, y, z) \tag{9.46}$$

with $f_i(x, y, z)$ a standard basis function such as those we have already seen (although extended to three dimensions). This was then repeated for $E_y$ and $E_z$ and substituted into Eq. (9.43) or (9.45). As Silvester and Pelosi comment [1, p. 8]:


The first approach (nodal elements) may be called the structural mechanics approach. ... at 
least some theory and much practical experience should be transferable to electromag- 
netics. Further, it has the appeal of simplicity and familiarity. The same approximating 
functions can be called upon to serve for both scalar and vector cases, and the vectorial 
coefficients have clear meaning as component representations of E or H....the structural 
mechanics approach has one major flaw for electromagnetic field analysis: it doesn’t work 
very well. The reason is simply that the fields that occur in structural mechanics and those 
encountered in EM are fundamentally different. The electromagnetic field vectors not only 
obey the Maxwell curl equations, but they are also constrained by the divergence equations. 


Before discussing some of the more intricate details of spurious modes, we note 
an immediate and practical problem with the nodal approach: since the field is ap- 
proximated by its values at the nodes, if we use the method we used for the static 
problems for connecting the elements (that is, all values at a node are set equal on 
all the elements which share the node), then the result is that we force all compo- 
nents of the field to be continuous. At an interface between two different types of 
material, only the tangential components of E or H should be continuous. If the 
material boundary happens to coincide with a plane parallel to one of the coordi- 
nate axes, then it would not be too difficult to arrange that we do this with only the 
tangential field components, leaving additional degrees of freedom to permit the 
normal field to be discontinuous. But in general, we are unlikely to be so fortunate, 
and the material interface will create a very tricky problem indeed. 

By comparison, the vector FEM approach approximates the full vector field

$$\vec{E} \approx \sum e_{ij}\,\vec{w}_{ij} \tag{9.47}$$

with the edge-based vector function:

$$\vec{w}_{ij} = \lambda_i\nabla\lambda_j - \lambda_j\nabla\lambda_i \tag{9.48}$$

As before, $\lambda_i$ is the simplex coordinate with respect to node $i$. This is then used to discretize Eq. (9.43) or (9.45).

It is far from immediately apparent why what appears to be a minor change 
in approach should yield significantly better solutions — after all, the vector basis 
functions are simply another way of representing the vector nature of the problem. 
In order to understand this, we need first to look a little more carefully at the 
high-frequency functionals. We will start with the eigenvalue problem, where the 
problems originate. 

Following the standard discretization and substitution of the basis functions, the stationary points of the functional, Eq. (9.45), correspond to solutions of the following generalized matrix eigenvalue equation:

$$[S]\{e_i\} = k_i^2\,[T]\{e_i\} \tag{9.49}$$

where $[S]$ and $[T]$ represent the discretized versions of the first and second terms in Eq. (9.45). The eigenvalues $k_i$ represent the resonance frequencies of the cavity, and the vectors $\{e_i\}$ the eigenvectors, i.e. the various resonant modes (or eigenmodes).

Various approaches are now possible. A particularly revealing one is to note that the divergence constraint,

$$\nabla\cdot\epsilon_r\vec{E} = 0 \tag{9.50}$$

is implied within the functional, but in frequency dependent form. We can see this by taking the divergence of both sides of Eq. (9.44) with $k = k_i$; noting the vector identity $\nabla\cdot\nabla\times\vec{a} = 0\ \forall\vec{a}$, it is clear that

$$k_i^2\,\nabla\cdot\epsilon_r\vec{E} = 0 \tag{9.51}$$

For the dynamic case (that is, $k_i \neq 0$) the divergence equation is indeed satisfied.

The problem, however, enters via the other possibility for satisfying this equation, namely $k_i = 0$. In this case, the divergence equation is no longer necessarily satisfied. This corresponds of course to the static case, where $\vec{E} = -\nabla V$. Since $\nabla\times\nabla V = 0\ \forall V$, every field which is the gradient of a potential, paired with a zero eigenvalue, $\{(\nabla V, 0)\}$, also satisfies the vector wave equation. This theoretically infinite set of solutions constitutes the null-space of the equation (also known as the kernel, abbreviated ker, in some of the literature).

A particularly elegant example of such a null-space eigenmode for the well- 
known rectangular waveguide problem was given by [6], and it is so illuminating 
that it is worth repeating here. Peterson considered the classic eigenvalue problem of a rectangular waveguide with PEC walls, dimensions $a$ by $b$, with the solutions

$$\vec{E}_{mn} = \hat{x}\,\frac{n\pi}{b}\cos\frac{m\pi x}{a}\sin\frac{n\pi y}{b} - \hat{y}\,\frac{m\pi}{a}\sin\frac{m\pi x}{a}\cos\frac{n\pi y}{b} \tag{9.52}$$

with eigenvalues:

$$k_{mn}^2 = \left(\frac{m\pi}{a}\right)^2 + \left(\frac{n\pi}{b}\right)^2 \tag{9.53}$$

This is very well known and features prominently in almost any undergraduate electromagnetic text. However, these texts never mention that there is another valid solution of the vector wave equation, viz. the static solution:

$$\vec{E}_{mn}^{\,\mathrm{spur}} = \hat{x}\,\frac{m\pi}{a}\cos\frac{m\pi x}{a}\sin\frac{n\pi y}{b} + \hat{y}\,\frac{n\pi}{b}\sin\frac{m\pi x}{a}\cos\frac{n\pi y}{b} \tag{9.54}$$

with eigenvalues $k_{mn} = 0$.

The spurious solution(s) look almost identical to the waveguide solutions, but are critically different; note that they can be written in the form

$$\vec{E}_{mn}^{\,\mathrm{spur}} = \nabla\left(\sin\frac{m\pi x}{a}\sin\frac{n\pi y}{b}\right) \tag{9.55}$$

Also very importantly, unlike Eq. (9.52), these static solutions do not have zero divergence, as can quickly be established by inspection.
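
The contrast between the two families of solutions can also be checked numerically. The sketch below is illustrative only: the function names, the sample point and the step size `h` are assumptions made here, and central differences stand in for the analytical derivatives. It evaluates the divergence and the $z$-component of the curl of the dynamic field of Eq. (9.52) and of the static field of Eq. (9.54).

```python
import math


def te_mode(x, y, m, n, a, b):
    """Dynamic field in the form of Eq. (9.52), returned as (Ex, Ey)."""
    kx, ky = m * math.pi / a, n * math.pi / b
    return (ky * math.cos(kx * x) * math.sin(ky * y),
            -kx * math.sin(kx * x) * math.cos(ky * y))


def spurious_mode(x, y, m, n, a, b):
    """Static field of Eq. (9.54): the gradient of sin(kx x) sin(ky y)."""
    kx, ky = m * math.pi / a, n * math.pi / b
    return (kx * math.cos(kx * x) * math.sin(ky * y),
            ky * math.sin(kx * x) * math.cos(ky * y))


def div_and_curl(field, x, y, h=1e-5, **params):
    """Central-difference estimates of div E and the z-component of curl E."""
    ex_xp, ey_xp = field(x + h, y, **params)
    ex_xm, ey_xm = field(x - h, y, **params)
    ex_yp, ey_yp = field(x, y + h, **params)
    ex_ym, ey_ym = field(x, y - h, **params)
    dex_dx = (ex_xp - ex_xm) / (2.0 * h)
    dey_dx = (ey_xp - ey_xm) / (2.0 * h)
    dex_dy = (ex_yp - ex_ym) / (2.0 * h)
    dey_dy = (ey_yp - ey_ym) / (2.0 * h)
    return dex_dx + dey_dy, dey_dx - dex_dy


if __name__ == "__main__":
    params = dict(m=1, n=1, a=2.0, b=1.0)
    print("dynamic  (div, curl_z):", div_and_curl(te_mode, 0.3, 0.2, **params))
    print("spurious (div, curl_z):", div_and_curl(spurious_mode, 0.3, 0.2, **params))
```

At a generic interior point the dynamic field shows a divergence that vanishes to within the finite-difference error and a non-zero curl, while the static field shows the opposite: zero curl (hence a zero eigenvalue) and non-zero divergence.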

Because the eigenvalues of these “spurious modes” are zero, these are simply 
rejected as unwanted solutions when one does an analytical solution of the prob- 
lem (using separation of variables, for instance). However — and this is a critical 
point! — the variational functional admits these solutions, and the finite element 
procedure will also compute them. (Unless, that is, one can modify the functional 
to exclude these solutions — there has been success with such approaches and we 
will mention this again later, but the formulation is somewhat more involved.) 

So, to summarize, due to the properties of the high-frequency variational func- 
tional, the finite element procedure will produce not only the wanted, dynamic 
eigenvalues and eigenvectors, but also a number of “zero” eigenvalues and asso- 
ciated static eigenvectors. Since the finite element solution is of course approx- 
imate, the “zero” eigenvalues will not be exactly zero, but may shift up in fre- 
quency. If their values become sufficiently large, they may creep into the range 
of the dynamic eigenvalues and we will no longer be able to distinguish between 
the dynamic eigenvalues and these (very poor) approximations of zero. In this case, we have a “spurious mode”: an eigenvalue and associated eigenvector in the high-frequency range, but not satisfying the divergence criteria and hence entirely unphysical.

There was significant confusion in some of the earlier papers on edge elements, 
as they were then known, and when reading some of these, one may find claims 
that edge elements entirely eliminate spurious modes. This is not correct — edge 
elements still compute these modes, but with better fidelity, so that they do not 
corrupt the desired range of eigenmodes. There have been other approaches which 
aim to eliminate the spurious modes entirely, but edge elements do not accomplish 
this. 

Regarding deterministic problems, since $k_i \neq 0$ will have been set in a deterministic problem, the numerical process, now being capable of reproducing an irrotational mode spectrum, instead ensures that such a modal content is absent [2, p. 313] (see footnote 9).

It is interesting that spurious modes were not encountered in the FDTD community. The reason is that the Yee grid implicitly satisfies Gauss’ laws (the divergence criteria).


9.6 Vector (edge) elements 
9.6.1 An historical perspective 


What are now called vector elements, but were originally known as “edge-based” 
elements, date back to the 1980s in CEM, although the underlying ideas of the 
structure of the electromagnetic field date back to 1957 and what are known as 
Whitney forms. In 1980, the French mathematician J.C. Nedelec published a pa- 
per which has since become the canonical reference in this field [8] although, 
ironically, he did not define the edge-based element itself; instead, the paper in- 
vestigates the structure of the polynomial spaces which the basis functions should 
span in a highly mathematical format, which is not readily accessible to electronic 
engineers. (He was clearly influenced by earlier ideas of Raviart and Thomas [9] 
and it is useful to read their paper before attempting to read Nedelec’s.) Some 
of the earliest work in electrical engineering is due to Bossavit [10]; Barton and 
Cendes [11] were among the first to address high-frequency electromagnetics with 
edge elements and their derivation is the one now generally given. Another type 
of related element, also a vector element, was the hexahedral element, originally 
introduced by Welij in its lowest order straight-sided form in 1985 [12], and in generalized form by Crowley, Silvester and Hurwitz in 1988 [13]. Cendes’ subsequent work produced one of the first higher-order tetrahedral elements [14]. Webb and Forghani’s work on hierarchal tetrahedral elements was the standard reference for many years [15], until succeeded by Webb’s later work [16].

9 There is another school of thought on this topic. It has been argued that the driven solution can be viewed as a sum of eigenvectors, and hence incorrect eigenvectors may also corrupt a deterministic problem [7, p. 408]. In any case, by either argument, edge elements also lead to better solutions for deterministic problems.

During the 1990s, many researchers made excellent use of these elements and 
also advanced the theory underlying them. The following is only a selection of 
the work: Lee and Mittra worked on cavity eigenvalue problems [17] (and this pa- 
per remains useful today, since it contains analytical expressions for the elemental 
matrices); Dibben and Metaxas used edge elements for time domain analysis [18]; 
Savage and Peterson introduced alternative higher-order tetrahedral elements in 
[3]; Jin, Volakis, Kempel and their students made significant contributions to appli- 
cations, especially cavity backed patches (this work is well summarized in [4, 19]), 
and also in new hierarchal elements [20]; Dyczij-Edlinger, Peng and Lee made ad- 
vances in understanding the impact of the low-frequency ill-conditioning of the 
curl-curl formulation [21]; Graglia, Wilton and Peterson made progress with inter- 
polatory as opposed to hierarchal elements [22]; and the present author extended 
work on waveguide analysis using higher-order mixed and complete elements [23], 
and with Botha, worked on error estimation [24]. 


9.6.2 Theory of vector elements 


With this historical background, we now return to the elements. Before we study 
them in detail, we will first look at the impact they had on CEM. Although much of 
the early literature concentrates on the “spurious mode” problem, there are prac- 
tical reasons which make these elements very useful in analysis. Firstly, for the 
lowest order elements, the degrees of freedom are proportional to the tangential 
electric field along an edge (and hence the widely used name, edge elements); 
we will show this shortly. Thus tangential continuity is very simple to enforce. 
Secondly, flux continuity is a natural boundary condition. Thirdly, it is easier to 
model corners, or other regions where the field becomes singular, since there is no 
nodal value at the singularity. Finally, they greatly ameliorated the problems with 
spurious modes: we will return to this subsequently. 

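Vector elements are most easily introduced using a 2D vector element for the rectangular element, shown in Fig. 9.2. The field is approximated as:

$$\vec{E}^e \approx \sum_{i=1}^{4}\vec{N}_i^e E_i^e \tag{9.56}$$

Here, $\vec{N}_i^e$ is the vector basis function and $E_i^e$ is a scalar degree of freedom, the tangential field along the $i$th edge in this case. The vector functions $\vec{N}_i^e$ are given by:

$$\begin{aligned}
\vec{N}_1^e &= \frac{1}{l_y^e}\left(y_c^e - y + \frac{l_y^e}{2}\right)\hat{x} \\
\vec{N}_2^e &= \frac{1}{l_y^e}\left(y - y_c^e + \frac{l_y^e}{2}\right)\hat{x} \\
\vec{N}_3^e &= \frac{1}{l_x^e}\left(x_c^e - x + \frac{l_x^e}{2}\right)\hat{y} \\
\vec{N}_4^e &= \frac{1}{l_x^e}\left(x - x_c^e + \frac{l_x^e}{2}\right)\hat{y}
\end{aligned}$$

with $(x_c^e, y_c^e)$ the coordinates of the center of the element, and $l_x^e$ and $l_y^e$ the element lengths in $x$- and $y$-directions respectively.

Figure 9.2 The rectangular edge element. Based on [4, Fig. 8.1].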

Now, note the following: $\vec{N}_1^e$ is zero on edge 2 (since $y = y_c^e + l_y^e/2$ everywhere on edge 2) and it is unity on edge 1; also, it is purely tangential ($x$-directed) along this edge. On edges 3 and 4 it increases linearly from the top to the bottom, and it is purely normal ($x$-directed) along these edges. One quickly establishes that $\vec{N}_2^e$ has the same properties, but with edges 1 and 2 interchanged, and that $\vec{N}_3^e$ and $\vec{N}_4^e$ also have similar properties, but obviously with $x$ and $y$ interchanged. In short, these basis functions provide a mixed-order approximation of the field: on the edges, the approximation is constant tangentially, and linear normally. (Indeed, these elements are frequently called CT/LN elements, constant tangential/linear normal.) Note also that due to these properties, $E_1^e$ is the tangential field along edge 1, and similarly $E_2^e$, $E_3^e$ and $E_4^e$ are the tangential fields along edges 2, 3 and 4 respectively. These are the degrees of freedom for this element. Very importantly, these properties permit enforcing tangential continuity without affecting the normal components, and this is precisely the boundary condition required by $\vec{E}$ or $\vec{H}$ fields, or indeed any 1-forms in the language of differential forms.
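
These properties are easy to confirm numerically. The following sketch is an illustration rather than code from the book; the function names `rect_edge_basis` and `field` are assumptions introduced here. It evaluates the four rectangular basis functions and the interpolated field of Eq. (9.56), so that one can check that the tangential field on each edge equals the corresponding degree of freedom.

```python
def rect_edge_basis(x, y, xc, yc, lx, ly):
    """The four CT/LN basis vectors N1..N4 at (x, y), each as (Nx, Ny)."""
    n1 = ((yc - y + ly / 2.0) / ly, 0.0)   # unity on edge 1 (y = yc - ly/2)
    n2 = ((y - yc + ly / 2.0) / ly, 0.0)   # unity on edge 2 (y = yc + ly/2)
    n3 = (0.0, (xc - x + lx / 2.0) / lx)   # unity on edge 3 (x = xc - lx/2)
    n4 = (0.0, (x - xc + lx / 2.0) / lx)   # unity on edge 4 (x = xc + lx/2)
    return n1, n2, n3, n4


def field(x, y, dofs, xc, yc, lx, ly):
    """Eq. (9.56): E = sum_i N_i E_i, returned as (Ex, Ey)."""
    basis = rect_edge_basis(x, y, xc, yc, lx, ly)
    ex = sum(e * n[0] for e, n in zip(dofs, basis))
    ey = sum(e * n[1] for e, n in zip(dofs, basis))
    return ex, ey
```

For an element centered at the origin with $l_x = 2$ and $l_y = 1$, the $x$-component of `field` anywhere on edge 1 ($y = -0.5$) equals the first entry of `dofs`, independent of $x$, which is the constant tangential behavior described above.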


A mathematical aside — differential forms 


Some of the work on vector elements uses the mathematics of differential 
forms — Bossavit is one of the main proponents of this [25]. Although the ideas 
can be readily understood without any knowledge of this field, it is useful to 
know a smattering of the terminology. 


• 0-forms: these are scalar functions with functional but not derivative continuity, an example being the electrostatic potential $\phi$.

• 1-forms: these are vector functions with tangential but not normal continuity, such as $\vec{E}$. These are also known as polar, or true, vectors, and are time-even under time reversal.

• 2-forms: these are vector functions with normal but not tangential continuity, such as $\vec{B}$. These are also known as axial vectors, or pseudo-vectors, and are time-odd under time reversal.

• 3-forms: discontinuous scalar functions, such as $\nabla\cdot\vec{D}$.


For an elegant discussion of polar versus axial vectors, and time symmetry, Feyn- 
man’s chapter on this is a classic [26, Chapter 52]. 


Note that this element is not by design interpolatory, although for this lowest order element it can be made thus (see footnote 10). The degrees of freedom ($E_1^e$, $E_2^e$, $E_3^e$ and $E_4^e$) represent field quantities along an edge; indeed, in Nedelec’s original work, they are defined as integrals of the tangential field component along the edge, i.e. the average tangential field value. This is quite different to the nodal elements discussed earlier.

We should also comment that there are a variety of names for this element, including mixed order; “first” order; “half-th order”, $H_0(\mathrm{curl})$; and as already mentioned, constant tangential/linear normal (CT/LN). This last is especially insightful and is the present author’s preference.

These elements have other significant properties. Interestingly, by taking $\hat{z}\times\vec{N}_i^e$, another class of elements is derived with the complementary property of providing normal continuity; these are useful for problems involving current, or 2-forms. Furthermore, we have already seen that the full-wave functional has a term of the form $\iint_S \nabla\times\vec{E}\cdot\nabla\times\vec{E}\,dS$. It is important to note that $x$-directed terms linear in $x$ do not contribute to this term; i.e. these would be “wasted” degrees of freedom, which have been removed from these elements. This observation, at heart, was the core of Nedelec’s contribution. Finally, within the element, the approximated $\vec{E}$ field has zero divergence. (Recall that this is not explicitly enforced in the curl-curl functional.) Because the spurious modes are associated with solutions with non-zero divergence, many early papers on vector elements concentrated on this property. Whilst low-order vector elements are indeed divergence free within the elements, the divergence is discontinuous at element boundaries, and furthermore, a number of successful vector elements are not divergence free. (Indeed, an argument has been made that since one is not removing the spurious modes, but computing them more accurately, the element should not be divergence free!) The superior suppression of spurious modes is now understood to be due to a better approximation of the null-space of the vector wave equation, that is, the zero frequency solutions we discussed above. The vector elements do a better job of representing these static $\nabla\phi$ eigenmodes; the reason is that the tangential-continuity-only of the vector elements admits a larger number of functions in the null-space. We noted earlier that $\phi$ should be continuous, implying that $\nabla\phi$ must be tangentially continuous (which is all that is imposed by edge elements), but the natural boundary condition permits the normal derivative to exhibit the correct jump discontinuity at material interfaces. Webb’s 1993 paper remains one of the best discussions of edge-based elements [27].

10 The degrees of freedom have been interpreted as the tangential field value at the center of the relevant edge by some researchers who have worked with interpolatory vector elements.


9.6.3 Vector elements on triangles — the Whitney element 


Our preceding discussion considered rectangular elements. As mentioned on sev- 
eral occasions, one of the main advantages of the FEM over the FDTD is the 
geometrical modelling flexibility afforded by triangular and tetrahedral elements 
in two and three dimensions respectively, so it is important to understand how the 
same properties can be obtained for these types of elements. 

Vector elements on simplicial elements are defined in terms of simplex coor- 
dinates. Again, these have acquired a variety of names during their development, 
including Whitney, Nedelec, Bossavit or simply edge-based elements. In its lowest 
order form, the element has the following definition: 


$$\vec{w}_{ij} = \lambda_i\nabla\lambda_j - \lambda_j\nabla\lambda_i \tag{9.57}$$


There are three such elements per triangle, or six per tetrahedron, each associated 
with the edge from node i to node j, as will now be demonstrated.

Figure 9.3 The right-angled parent triangle. (Labels in the figure: edge 1, edge 2, edge 3.)


The Whitney element is the basis for all vector simplicial elements, both in- 
terpolatory and hierarchal, so its properties are of great importance. Firstly, an 
obvious question is, why does it have this specific form? To answer this, it is use- 
ful to study the right-angled triangle shown in Fig. 9.3, of unit length along the 
x- and y-axes. (It is also a useful exercise in understanding simplex coordinates.) 
The simplex coordinates are the ratios as follows: 


$$\lambda_1 = \frac{\mathrm{area}_{P23}}{\mathrm{area}_{123}} = \frac{\tfrac{1}{2}\,\mathrm{base}\times\mathrm{height}}{1/2} = y \tag{9.58}$$

since the area of triangle 123 is 1/2, and the base of triangle P23 is unity and its height is $y$.

Similarly,

$$\lambda_2 = 1 - (x + y), \qquad \lambda_3 = x \tag{9.59}$$

The expression for $\lambda_2$ is easily derived from the property $\sum_{i=1}^{3}\lambda_i = 1$. Now that we have explicit expressions for the simplex coordinates, their gradients follow trivially:

$$\nabla\lambda_1 = \hat{y} \tag{9.60}$$

$$\nabla\lambda_2 = -\hat{x} - \hat{y} \tag{9.61}$$

$$\nabla\lambda_3 = \hat{x} \tag{9.62}$$

We note that $\nabla\lambda_1$ is normal to edge 1 (that is, the edge opposite node 1), and similarly $\nabla\lambda_2$ and $\nabla\lambda_3$ are normal to edges 2 and 3 respectively.
Now, the Whitney functions can be written in explicit Cartesian form as follows: 


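$$\begin{aligned}
\vec{N}_1 = \vec{w}_{23} &= \lambda_2\nabla\lambda_3 - \lambda_3\nabla\lambda_2 \\
&= (1 - x - y)\hat{x} + x(\hat{x} + \hat{y}) \\
&= (1 - y)\hat{x} + x\hat{y} \\
\vec{N}_2 = -\vec{w}_{13} &= -y\hat{x} + x\hat{y} \\
\vec{N}_3 = \vec{w}_{12} &= -y\hat{x} - (1 - x)\hat{y}
\end{aligned} \tag{9.63}$$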


These are illustrated in Fig. 9.4. 

Due to the simple form of these functions on this right-angled parent element, we can immediately establish some of the crucial features of these functions. Let us focus on $\vec{N}_1 = \vec{w}_{23}$. Along edges 2 and 3, this function is purely normal, and increases linearly from node 1 to node 2 along edge 3, and similarly from node 1 to node 3 along edge 2. Along edge 1, it has both tangential and normal components. These are easily separated on this right-angled parent element; on edge 1, they are the $\hat{x}$ and $\hat{y}$ components respectively, that is, $(1 - y)|_{y=0} = 1$ and $x$ respectively. Thus, on this edge, the tangential component is constant, and the normal component is linear. In short, $\vec{N}_1 = \vec{w}_{23}$ is a basis function with a constant tangential component on edge 1, and linear normal components along all the edges. The same is easily shown for the other two basis functions. Hence, this Whitney element has the same mixed-order CT/LN behavior as the rectangular element studied earlier. Furthermore, suitable degrees of freedom are again the average tangential fields along each edge. It is also immediately obvious from Eq. (9.63) that the divergence of the Whitney functions is zero.
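
The tangential behavior described above can be verified directly on the parent triangle. The sketch below is illustrative only; the helper names and the sampling fractions are assumptions introduced here. It builds $\vec{w}_{ij}$ from Eqs. (9.57) to (9.62) and evaluates its tangential component along any edge.

```python
import math

# Parent triangle of Fig. 9.3, as implied by Eqs. (9.58)-(9.59):
# lambda_1 = y, lambda_2 = 1 - x - y, lambda_3 = x.
NODES = {1: (0.0, 1.0), 2: (0.0, 0.0), 3: (1.0, 0.0)}
GRAD = {1: (0.0, 1.0), 2: (-1.0, -1.0), 3: (1.0, 0.0)}   # Eqs. (9.60)-(9.62)


def simplex(x, y):
    """Simplex coordinates on the parent triangle, keyed by node number."""
    return {1: y, 2: 1.0 - x - y, 3: x}


def whitney(i, j, x, y):
    """w_ij = lambda_i grad(lambda_j) - lambda_j grad(lambda_i), Eq. (9.57)."""
    lam = simplex(x, y)
    return (lam[i] * GRAD[j][0] - lam[j] * GRAD[i][0],
            lam[i] * GRAD[j][1] - lam[j] * GRAD[i][1])


def tangential_component(i, j, start, end, s):
    """Tangential component of w_ij at fraction s along the edge start -> end."""
    (xs, ys), (xe, ye) = NODES[start], NODES[end]
    length = math.hypot(xe - xs, ye - ys)
    tx, ty = (xe - xs) / length, (ye - ys) / length
    wx, wy = whitney(i, j, xs + s * (xe - xs), ys + s * (ye - ys))
    return wx * tx + wy * ty
```

Sampling `tangential_component(2, 3, 2, 3, s)` for several values of `s` returns unity along edge 1, while the same function evaluated along the other two edges returns zero; this is the constant tangential property of $\vec{N}_1 = \vec{w}_{23}$.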

An important note: although we have established these properties on a right- 
angled parent element, they are generally true for Whitney elements on any trian- 
gle; we will not however show this now. (Some further discussion on the Whitney 
element may be found in Appendix A.) 

Another important point: what of the normal field components? The boundary 
condition in this case is normal flux continuity; it turns out that this is a natural 
boundary condition of the variational process, and hence is automatically satisfied 
at material interfaces [4, Section 5.8.3]. 

It is an interesting question to ask why this function might originally have been proposed. Firstly, as already noted, the gradient of a simplex coordinate is constant, and is directed perpendicular to the edge opposite the relevant node. Hence, using the gradient of the simplex coordinates promises a method to separate normal and tangential components, which it will be recalled is highly desirable, due to the continuity requirements of vector fields. Now, using $\nabla\lambda_j$ alone can only give a constant approximation; multiply by $\lambda_i$ and a linear form is obtained. To make this linear along an edge, both non-zero simplex coordinates are needed. Thus $\lambda_i\nabla\lambda_j \pm \lambda_j\nabla\lambda_i$ is a reasonable guess. The $+$ form can be rewritten as $\nabla(\lambda_i\lambda_j)$, which is in the null-space of the curl operator (the first term in the functional), hence the $-$ form is a good guess. Our detailed analysis above on the right-angled parent element confirms this.

Figure 9.4 The three Whitney basis functions for triangles.

In closing this introductory discussion on Whitney elements, it is very important to note that the vector field can only be recovered by the vector sum of
the three vector basis functions and the appropriate amplitudes (the degrees of 
freedom which the finite element procedure yields); the degrees of freedom lose 
the convenient interpretation of nodal elements as a field component value at a 
node. 

Whitney elements revolutionized HF FEM analysis from the mid 1980s on; 
Ansoft’s Eminence package (now HFSS) was one of the first commercial codes 
to exploit these elements for the three-dimensional finite element analysis of high- 
frequency devices. Extending the elements to higher order has been a controversial 
topic; many different forms have been published. The most comprehensive publi- 
cation in the electrical engineering literature is Webb’s relatively recent work [16]. 
A comparison of a number of these elements has been published by the present 
author [23]. This is discussed in Chapter 10. 


9.7 Application to waveguide eigenvalue analysis 


9.7.1 The two-dimensional variational functional 
for homogeneous waveguide 


Waveguide eigenanalysis is one of the classic applications of the FEM. It is useful 
in its own right, but also serves as an excellent tool to illustrate the application of 
vector elements. We will analyze a rectangular waveguide, homogeneously filled, 
since then the eigenmodes split into pure TE and TM modes; an inhomogeneously 
loaded waveguide requires a more complex approach since the propagation modes 
are then hybrid in nature. (A discussion of this and suitable formulation may be 
found in [4, Section 8.2].) The functional for the transverse field components, subject to the prescribed boundary condition $\hat{n}\times\vec{E}_t = 0$, is:

$$F(\vec{E}_t) = \frac{1}{2}\iint_S\left[\frac{1}{\mu_r}\left(\nabla_t\times\vec{E}_t\right)\cdot\left(\nabla_t\times\vec{E}_t\right) - k_c^2\epsilon_r\vec{E}_t\cdot\vec{E}_t\right]dS \tag{9.64}$$

This is Eq. (9.45), with $E_z = 0$, i.e. no longitudinal field components, and again assuming lossless materials. The eigenvectors of this eigenvalue problem are the TE modes, and the eigenvalues $k_c$ are the corresponding cut-off wavenumbers (see footnote 11), with $k_z = 0$. $\nabla_t$ is the transverse del operator.

It is important to note that the prescribed boundary conditions, which amount 
to the edges lying on the PEC, must be explicitly enforced. This implies that the 
vector of unknowns, {e}, in the generalized eigenvalue problem: 


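$$[S]\{e\} = k_c^2\,[T]\{e\} \tag{9.65}$$

would appear to include prescribed, i.e. zero, values. This is incorrect. It may be shown that this equation includes only contributions from the free edges, i.e.

$$[S_{ff}]\{e_f\} = k_c^2\,[T_{ff}]\{e_f\} \tag{9.66}$$

To derive this, write the discretized functional before it is rendered stationary as:

$$F\{e\} = \begin{Bmatrix} e_f \\ e_p \end{Bmatrix}^T
\begin{bmatrix} S_{ff} & S_{fp} \\ S_{pf} & S_{pp} \end{bmatrix}
\begin{Bmatrix} e_f \\ e_p \end{Bmatrix} \tag{9.67}$$

where the subscripts $f$ and $p$ denote free and prescribed edges respectively. Now, differentiating with respect to the free edges, and then applying the prescribed boundary condition $\{e_p\} = 0$, one obtains Eq. (9.66).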

This does of course require (globally) numbering the free edges first, and then 
the prescribed edges. If using a connection matrix approach, another renumbering 
matrix could be used afterwards to implement this. Alternatively, during matrix 
assembly, any entries corresponding to prescribed edges can simply be removed 
from the system. 
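
The last option, discarding the rows and columns of prescribed edges, can be sketched as follows. This is a minimal illustration using plain nested lists; the function name `free_submatrix` and the zero-based edge numbering are assumptions made here, and a practical code would more likely operate on sparse matrices.

```python
def free_submatrix(matrix, prescribed_edges):
    """Return (sub, free): the free-free block of a square matrix.

    `matrix` is the assembled global matrix ([S] or [T]) as a list of rows,
    and `prescribed_edges` holds the zero-based indices of edges on the PEC.
    `free` lists the retained global edge indices, in their original order.
    """
    prescribed = set(prescribed_edges)
    free = [k for k in range(len(matrix)) if k not in prescribed]
    sub = [[matrix[r][c] for c in free] for r in free]
    return sub, free
```

Applying the same call to both $[S]$ and $[T]$ with the same list of prescribed edges yields the blocks $[S_{ff}]$ and $[T_{ff}]$ of Eq. (9.66); the returned `free` list maps the entries of the reduced eigenvectors back to global edge numbers.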

If the TM modes are sought, then Eq. (9.64) must be solved with $\vec{H}_t$ as the
working variable. In this case, homogeneous Neumann boundary conditions are 
appropriate — i.e. no explicit boundary conditions need be set at all. 

This problem is especially easy to solve using rectangular elements, but since 
we would like to illustrate the application of the Whitney elements, we will use tri- 
angular elements. Firstly, we will need a mesh of such elements, but we will defer 
consideration of this until later, and concentrate initially on the theoretical analysis. For each element, we need the elemental (that is, stiffness [S] and mass [T])
matrix elements. Using simplex coordinates, we can evaluate these quite easily. 


9.7.2 Explicit formula for the elemental matrix entries 


Before deriving expressions for the elemental matrices, it is worth briefly reviewing the two approaches which have been used. The approach we will use is essentially a direct approach, where we evaluate the simplex coordinates in terms of the Cartesian coordinates of the actual element. The other approach uses the right-angled parent element of Fig. 9.3, and computes the matrices for this element; a coordinate transformation is then performed to the actual element, and the inverse of the Jacobian of this transformation is used to scale the matrix elements. The former approach is that of Lee and Mittra, who published some of the first explicit formulas in [17] for tetrahedral CT/LN elements (these formulas were extended by the present author to diagonally anisotropic materials in [28]), Savage and Peterson, who presented a very useful alternative formulation in [3], and Jin [4]. The latter approach is best exemplified by [7]. Savage and Peterson’s approach leads to particularly compact expressions, and is the one we will use here. The following is based on their work, but simplified to triangles, using notation consistent with that of this chapter, and using the standard Whitney elements. (Savage and Peterson further scale the elements by the edge lengths.)

11 This is a special case of the more general functional [4, Eq. 8.36], which includes non-zero values of $k_z$.
Recall that the variational formulation requires the evaluation of two matrices:

$$S_{ij} = \iint_S \nabla_t\times\vec{N}_i\cdot\nabla_t\times\vec{N}_j\,dS \tag{9.68}$$

and

$$T_{ij} = \iint_S \vec{N}_i\cdot\vec{N}_j\,dS \tag{9.69}$$

With $z$ the direction of propagation, the $\nabla_t\times$ and $\nabla_t\cdot$ operators reduce to the two-dimensional operators in the $(x, y)$ plane, which we will imply in the following. The CT/LN elements are given by $\vec{N}_i = \vec{w}_{i1,i2} = \lambda_{i1}\nabla\lambda_{i2} - \lambda_{i2}\nabla\lambda_{i1}$ per edge. Here, $i1$ and $i2$ are the endpoints of edge $i$. The local triangular numbering scheme is as already discussed.
Now, the three simplex coordinates $\lambda_i$ are given by

\[ \lambda_i = a_i + b_i x + c_i y \tag{9.70} \]

and the gradient thereof by

\[ \nabla\lambda_i = b_i \hat{\mathbf{x}} + c_i \hat{\mathbf{y}} \tag{9.71} \]


The actual coefficients $\{a_i; b_i; c_i\}$ may be computed by inverting the coordinate
matrix

\[ \begin{bmatrix} b_1 & c_1 & a_1 \\ b_2 & c_2 & a_2 \\ b_3 & c_3 & a_3 \end{bmatrix} = \begin{bmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \\ 1 & 1 & 1 \end{bmatrix}^{-1} \tag{9.72} \]

This equation may be obtained by writing Eq. (9.70) for each node i and noting
that $\lambda_i = 1$ at node i and vanishes at the other two nodes. Now the following
two vectors are defined for nodes i and j:
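\[ \mathbf{v}_{ij} = \nabla\lambda_i \times \nabla\lambda_j = \hat{\mathbf{z}} (b_i c_j - b_j c_i) = -\mathbf{v}_{ji} \tag{9.73} \]

This vector is easily computed once $\{b_i; c_i\}$ are known. Similarly we define

\[ \phi_{ij} = \nabla\lambda_i \cdot \nabla\lambda_j = b_i b_j + c_i c_j \tag{9.74} \]

Note that both $\mathbf{v}_{ij}$ and $\phi_{ij}$ are constant within a triangle, and hence may be taken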
outside integrals in which they appear. 
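
As an illustration of Eqs. (9.70) to (9.74), the following Python sketch computes the simplex coefficients and the quantities $\mathbf{v}_{ij}$ and $\phi_{ij}$ for a single triangle. It is a minimal sketch only: the function names `simplex_coefficients`, `v_z` and `phi` are our own (hypothetical) choices, local indices are numbered from 0 rather than 1, and closed-form cofactor expressions are used in place of an explicit numerical inverse of the coordinate matrix in Eq. (9.72).

```python
def simplex_coefficients(xy):
    """Return lists (a, b, c) such that lambda_i = a[i] + b[i]*x + c[i]*y.

    xy holds the three (x, y) vertex coordinates of one triangle.
    Local indices are 0-based here, so lambda_1 of the text is index 0.
    """
    (x1, y1), (x2, y2), (x3, y3) = xy
    # Determinant of the coordinate matrix; equal to twice the signed area.
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    if det == 0:
        raise ValueError("degenerate (zero-area) triangle")
    a = [(x2 * y3 - x3 * y2) / det,
         (x3 * y1 - x1 * y3) / det,
         (x1 * y2 - x2 * y1) / det]
    b = [(y2 - y3) / det, (y3 - y1) / det, (y1 - y2) / det]
    c = [(x3 - x2) / det, (x1 - x3) / det, (x2 - x1) / det]
    return a, b, c


def v_z(b, c, i, j):
    """z-component of v_ij = grad(lambda_i) x grad(lambda_j), Eq. (9.73)."""
    return b[i] * c[j] - b[j] * c[i]


def phi(b, c, i, j):
    """phi_ij = grad(lambda_i) . grad(lambda_j), Eq. (9.74)."""
    return b[i] * b[j] + c[i] * c[j]


# Example: right-angled triangle with vertices (0,0), (1,0), (0,1),
# for which lambda_1 = 1 - x - y, lambda_2 = x and lambda_3 = y.
a, b, c = simplex_coefficients([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)])
```

For the right-angled example triangle the coefficients can be checked by hand, which makes it a convenient first test case before moving on to a general mesh.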
Consider the evaluation of the curl-curl term, Eq. (9.68): 


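\[ \begin{aligned}
\nabla \times \mathbf{N}_i &= \nabla \times (\lambda_{i1} \nabla\lambda_{i2} - \lambda_{i2} \nabla\lambda_{i1}) \\
&= \nabla \times (\lambda_{i1} \nabla\lambda_{i2}) - \nabla \times (\lambda_{i2} \nabla\lambda_{i1}) \\
&= 2 \nabla\lambda_{i1} \times \nabla\lambda_{i2} \\
&= 2 \mathbf{v}_{i1,i2}
\end{aligned} \tag{9.75} \]


From the second to third line in the above, the vector identities $\nabla \times (\phi \mathbf{A}) = \phi \nabla \times \mathbf{A} + \nabla\phi \times \mathbf{A}$
and $\nabla \times \nabla\phi = 0$ have been used.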
Using this, Eq. (9.68) becomes: 


\[ S_{ij} = 4 \iint_S \mathbf{v}_{i1,i2} \cdot \mathbf{v}_{j1,j2} \, dS = 4 A \, \mathbf{v}_{i1,i2} \cdot \mathbf{v}_{j1,j2} \tag{9.76} \]


Note that the widely used expression for element area in terms of the determi- 
nant of the coordinate matrix, 


\[ 2A' = \begin{vmatrix} 1 & x_1 & y_1 \\ 1 & x_2 & y_2 \\ 1 & x_3 & y_3 \end{vmatrix} \tag{9.77} \]


actually yields a potentially signed area A’, whose sign depends on the sense 
(clockwise or anticlockwise) of the coordinate numbering. A in the above is the 
unsigned area of the element, that is A = |A’|. 
The second term that appears in Eq. (9.69) requires the computation of dot prod- 
ucts: 
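\[ \begin{aligned}
\mathbf{N}_i \cdot \mathbf{N}_j &= (\lambda_{i1} \nabla\lambda_{i2} - \lambda_{i2} \nabla\lambda_{i1}) \cdot (\lambda_{j1} \nabla\lambda_{j2} - \lambda_{j2} \nabla\lambda_{j1}) \\
&= [\lambda_{i1} \lambda_{j1} (\nabla\lambda_{i2} \cdot \nabla\lambda_{j2}) - \lambda_{i1} \lambda_{j2} (\nabla\lambda_{i2} \cdot \nabla\lambda_{j1}) \\
&\quad - \lambda_{i2} \lambda_{j1} (\nabla\lambda_{i1} \cdot \nabla\lambda_{j2}) + \lambda_{i2} \lambda_{j2} (\nabla\lambda_{i1} \cdot \nabla\lambda_{j1})]
\end{aligned} \tag{9.78} \]

Using the notation of Eq. (9.74), this can be written as:

\[ \mathbf{N}_i \cdot \mathbf{N}_j = [\lambda_{i1} \lambda_{j1} \phi_{i2,j2} - \lambda_{i1} \lambda_{j2} \phi_{i2,j1} - \lambda_{i2} \lambda_{j1} \phi_{i1,j2} + \lambda_{i2} \lambda_{j2} \phi_{i1,j1}] \tag{9.79} \]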


Thus the associated matrix elements become: 


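\[ \begin{aligned}
T_{ij} &= \phi_{i2,j2} \iint_S \lambda_{i1} \lambda_{j1} \, dS - \phi_{i2,j1} \iint_S \lambda_{i1} \lambda_{j2} \, dS \\
&\quad - \phi_{i1,j2} \iint_S \lambda_{i2} \lambda_{j1} \, dS + \phi_{i1,j1} \iint_S \lambda_{i2} \lambda_{j2} \, dS
\end{aligned} \tag{9.80} \]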


Using the general integration formula for integrals in simplex coordinates 


[2, p. 458]:

\[ \iint_S \lambda_1^i \lambda_2^j \lambda_3^k \, dS = \frac{2! \, i! \, j! \, k!}{(2 + i + j + k)!} A \tag{9.81} \]


the expression for $T_{ij}$ may be simplified (note that 0! = 1). In Eq. (9.80), each integral
involves integration over two simplex coordinates, possibly identical. These
can be expressed in matrix form as

\[ M_{ij} = \frac{1}{A} \iint_S \lambda_i \lambda_j \, dS = \frac{1}{12} \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix} \tag{9.82} \]


Using this, Eq. (9.80) reduces to 
\[ \begin{aligned}
T_{ij} &= A [\phi_{i2,j2} M_{i1,j1} - \phi_{i2,j1} M_{i1,j2} \\
&\quad - \phi_{i1,j2} M_{i2,j1} + \phi_{i1,j1} M_{i2,j2}]
\end{aligned} \tag{9.83} \]
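
Eqs. (9.76) and (9.83) translate almost line by line into code. The following Python sketch is one possible rendering, under a few assumptions of ours: the names `LOCAL_EDGES`, `gradients` and `element_matrices` are hypothetical, local nodes are numbered from 0, and the local edges are taken in the order $e_{12}$, $e_{13}$, $e_{23}$. The unsigned area is used, as required by the remark following Eq. (9.77).

```python
# Local edges as (start node, end node), 0-based: e12, e13, e23 (assumed order).
LOCAL_EDGES = [(0, 1), (0, 2), (1, 2)]

# Normalized simplex integration matrix of Eq. (9.82).
M = [[2 / 12, 1 / 12, 1 / 12],
     [1 / 12, 2 / 12, 1 / 12],
     [1 / 12, 1 / 12, 2 / 12]]


def gradients(xy):
    """Return (b, c, area): gradient coefficients of Eq. (9.71) and unsigned area."""
    (x1, y1), (x2, y2), (x3, y3) = xy
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)  # 2A', signed
    b = [(y2 - y3) / det, (y3 - y1) / det, (y1 - y2) / det]
    c = [(x3 - x2) / det, (x1 - x3) / det, (x2 - x1) / det]
    return b, c, abs(det) / 2.0


def element_matrices(xy):
    """Return the 3x3 elemental matrices (S, T) of Eqs. (9.76) and (9.83)."""
    b, c, area = gradients(xy)

    def v(i, j):      # z-component of v_ij, Eq. (9.73)
        return b[i] * c[j] - b[j] * c[i]

    def phi(i, j):    # phi_ij, Eq. (9.74)
        return b[i] * b[j] + c[i] * c[j]

    S = [[0.0] * 3 for _ in range(3)]
    T = [[0.0] * 3 for _ in range(3)]
    for p, (i1, i2) in enumerate(LOCAL_EDGES):
        for q, (j1, j2) in enumerate(LOCAL_EDGES):
            S[p][q] = 4.0 * area * v(i1, i2) * v(j1, j2)
            T[p][q] = area * (phi(i2, j2) * M[i1][j1] - phi(i2, j1) * M[i1][j2]
                              - phi(i1, j2) * M[i2][j1] + phi(i1, j1) * M[i2][j2])
    return S, T


# Example: right-angled triangle with unit legs.
S, T = element_matrices([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)])
```

For this triangle the first basis function is $\mathbf{w}_{12} = (1 - y)\hat{\mathbf{x}} + x\hat{\mathbf{y}}$, and integrating its squared magnitude by hand gives 1/3, which agrees with the computed entry `T[0][0]`. A hand check of this kind on one element is a useful safeguard before assembly.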


9.7.3 Coding 


We now have all the theory we need. However, finite element codes require a lot 
of “house-keeping” — the unstructured nature of finite element meshes is both their 
strong point (permitting very accurate local geometrical modelling) and a significant
complication (since a lot of lists need to be generated and then maintained).
We will now discuss a number of these issues. 


Edge and node numbering schemes 


With an FEM code, adopting sensible local and global numbering conventions and 
then using these consistently is absolutely essential. The local edge numbering 
scheme we discussed earlier (whereby the edge number corresponds to the node 
opposite) is not widely used in practice. The following is the most widely used in 
the literature: 
Edge        Local edge number


$e_{12}$    1
$e_{13}$    2
$e_{23}$    3


In the above, $e_{ij}$ is the edge directed from node i to node j. It is important to
note that although the degree of freedom associated with the edge is a scalar, it is 
nonetheless signed. 

A convention that can be recommended is first to sort the nodes in each element 
into ascending global order. This ensures that when edges are assigned, they are 
always directed from lower to higher node numbers, and thus the edges shared by 
two or more elements always have the same sign. All the local edge numbering 
schemes in use in the literature are consistent [3, 17] (taking into account that 
some number from 0 and some from 1). (Note that the sign of the edges is not, 
however, consistent: for example, edge 3 in [4, Fig. 8.2] has the opposite sense to 
that above.) 

Global edge numbers are assigned from 1 upwards; within an element, global 
edges are incremented in the same pattern as the local edges. To illustrate this by 
example, element e1 will always contain edges 1, 2 and 3 (although not necessarily 
global nodes 1, 2, 3 and 4, of course, since these are assigned by the mesher); if 
element e2 shares its first edge with element e1, then its remaining edges will be 
globally numbered 4 and 5, local edges $e_{13}$ and $e_{23}$ respectively.

The above sounds more complex than it is, as is often the case with finite element 
data structures, and becomes clear when coding. 
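
The numbering convention can be sketched in a few lines of Python. This is an illustration under our own assumptions rather than a prescribed implementation: the function name `build_edges` is hypothetical, and a dictionary keyed on the (low, high) node pair is used to recognize edges that have already been numbered, which is a convenience of Python rather than something the convention itself requires.

```python
def build_edges(elements):
    """Assign global edge numbers following the sorted-node convention.

    elements is a list of 3-tuples of global node numbers (1-based, any order).
    Returns (sorted_elements, edge_nodes, element_edges), where edge_nodes[k-1]
    gives the two nodes of global edge k, and element_edges[e] lists the global
    edge numbers of element e in local order e12, e13, e23.
    """
    edge_number = {}       # (low node, high node) -> global edge number
    edge_nodes = []
    element_edges = []
    sorted_elements = []
    for nodes in elements:
        n = tuple(sorted(nodes))            # ascending global node order
        sorted_elements.append(n)
        local = [(n[0], n[1]), (n[0], n[2]), (n[1], n[2])]
        row = []
        for edge in local:
            if edge not in edge_number:     # new edge: take the next number
                edge_nodes.append(edge)
                edge_number[edge] = len(edge_nodes)
            row.append(edge_number[edge])
        element_edges.append(row)
    return sorted_elements, edge_nodes, element_edges


# Two triangles sharing the edge between nodes 2 and 3.
sorted_elements, edge_nodes, element_edges = build_edges([(1, 2, 3), (3, 2, 4)])
```

In this example the second element shares its first local edge with the first element, so its edges are numbered 3, 4 and 5, as in the description above. Because every edge runs from the lower to the higher node number, both elements see the shared edge with the same sense.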


Data structures 
Before programming starts, it is useful to establish the major data structures that 
will be needed. For a mesh with $N_n$ nodes, $N_e$ elements, and E edges, the major
data structures required will include at least the following.


vertices Dimensioned as ($N_n$, 2). This stores the (x, y) coordinates of each vertex
(node).


nodes Dimensioned as ($N_e$, 3). This stores the three nodes associated with each
element.


edge_nodes Dimensioned as (E, 2). This stores the global nodes that each edge
connects.


materials Dimensioned as ($N_e$). This stores the material number. Another (usually
very much smaller) data structure will be required to store the actual constitutive
parameters for each material.


dof Dimensioned as E for Whitney elements. These are the degrees of freedom.


Two major data structures omitted here (deliberately) are the [S] and [T] matrices 
for the system. For initial work, these can simply be stored as full matrices, but to 
exploit fully the power of the FEM, sparse storage schemes must be used. This is 
discussed in Chapter 10. 

The above data structures are accessed so frequently that they should be globally 
accessible. In MATLAB, this is done using the global statement. In FORTRAN 90, 
one uses modules. 


Meshing 


For the beginner, this often seems the most challenging task. For 2D problems 
however, one can build quite satisfactory meshes by hand. The easiest way of gen- 
erating triangular meshes for a rectangular domain is first to divide the domain into 
smaller rectangles, and then to split each of these further into two triangles. An ex- 
ample of such a mesh is shown in Fig. 9.5. (Also shown on this plot are global 
node and element numbers; the manner in which these are assigned is essentially 


Figure 9.5 An eight-element triangular mesh. 
arbitrary, and the finite element code should be able to handle this.) It is also easy
to automate this type of meshing procedure. 
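
A minimal sketch of such an automated procedure is given below in Python. The function name `rectangle_mesh`, the row-by-row node numbering, and the choice of diagonal used to split each rectangle are all our own assumptions; as noted above, the numbering is essentially arbitrary.

```python
def rectangle_mesh(a, b, nx, ny):
    """Mesh the rectangle [0, a] x [0, b] with 2*nx*ny triangles.

    Returns (vertices, elements): vertices is a list of (x, y) tuples, and
    elements is a list of 3-tuples of 1-based global node numbers.
    """
    vertices = []
    for j in range(ny + 1):                 # rows of nodes, bottom to top
        for i in range(nx + 1):             # left to right within a row
            vertices.append((a * i / nx, b * j / ny))
    elements = []
    for j in range(ny):
        for i in range(nx):
            n1 = j * (nx + 1) + i + 1       # lower-left node of this rectangle
            n2 = n1 + 1                     # lower-right
            n3 = n1 + nx + 1                # upper-left
            n4 = n3 + 1                     # upper-right
            elements.append((n1, n2, n4))   # split along the n1-n4 diagonal
            elements.append((n1, n4, n3))
    return vertices, elements


# A 2 x 2 grid of rectangles in an X-band guide cross-section: 8 triangles.
vertices, elements = rectangle_mesh(22.86e-3, 10.16e-3, 2, 2)
```

With `nx = ny = 2` this produces nine nodes and eight triangles; whether this layout coincides with that of Fig. 9.5 has not been checked here.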


Book-keeping 
The issue of making the edges has already been discussed. The book-keeping re- 
quired does not end here, however. One also needs connection information (the 
equivalent of the connection matrix discussed earlier). For a “regular” triangular 
mesh such as that of Fig. 9.5, it is clear that an edge can be connected to at most 
two triangles, but in general, no such assumption can be made. 

Building the interconnectivity data is primarily a problem in list-searching. The
simplest method is, for each edge, to search through all the elements and check
whether the edge nodes coincide. This is not a good idea for large meshes,
since it is an $O(EN) \approx O(N^2)$ operation, but for small meshes it works. Real
codes use additional node-element lists to accelerate the search.
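
The idea of a node-element list can be sketched as follows in Python. The function name `edge_to_elements` is hypothetical, and Python sets are used for the intersection step purely for brevity. An element contains a given edge exactly when it contains both end nodes of that edge, so only the elements attached to those two nodes need to be examined.

```python
def edge_to_elements(elements, edge_nodes, num_nodes):
    """For each edge, list the (1-based) elements that contain it.

    elements:   list of 3-tuples of 1-based global node numbers
    edge_nodes: list of (node, node) pairs, one per global edge
    num_nodes:  total number of nodes in the mesh
    """
    # Node-element list: for every node, the set of elements touching it.
    node_elements = [set() for _ in range(num_nodes + 1)]   # index 0 unused
    for e, nodes in enumerate(elements, start=1):
        for n in nodes:
            node_elements[n].add(e)
    # Elements sharing both end nodes of an edge are those containing the edge.
    return [sorted(node_elements[n1] & node_elements[n2])
            for (n1, n2) in edge_nodes]


# Two triangles sharing the edge between nodes 2 and 3.
connections = edge_to_elements(
    [(1, 2, 3), (2, 3, 4)],
    [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)],
    4,
)
```

Building the node-element list takes one pass over the elements, after which the work per edge depends only on how many elements meet at its two nodes, not on the size of the mesh.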

One also needs some type of renumbering scheme, so that the free edges may 
be numbered first. An approach which works is first to flag each edge as free or 
prescribed. In the present case, simply checking whether the nodal coordinates of 
the edge coincide with x = 0, x =a, y = 0 or y = b is sufficient, but in general 
this can also be quite a complex search. Once this has been done, an index list is 
then built which gives the original global edge number for each degree of freedom. 
Again, this sounds more complex than it actually is. With these data, and with the 
convention that shared edges have the same sign, matrix assembly proceeds very 
quickly. 
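
For the rectangular waveguide, the flagging and renumbering step might be sketched as below. The function name `number_free_edges` and the tolerance `tol` are our own assumptions. Note that the sketch treats an edge as prescribed only when both of its end nodes lie on the same wall; a diagonal edge joining two different walls lies in the interior and remains free.

```python
def number_free_edges(vertices, edge_nodes, a, b, tol=1e-12):
    """Return the index list of free edges for the rectangle [0, a] x [0, b].

    vertices:   list of (x, y) tuples; node n is vertices[n - 1]
    edge_nodes: list of (node, node) pairs; edge k is edge_nodes[k - 1]
    The returned list gives, for each degree of freedom in turn, the
    original (1-based) global edge number.
    """
    def both_on(p, q, wall):
        return abs(p - wall) < tol and abs(q - wall) < tol

    free_index = []
    for k, (n1, n2) in enumerate(edge_nodes, start=1):
        (x1, y1), (x2, y2) = vertices[n1 - 1], vertices[n2 - 1]
        prescribed = (both_on(x1, x2, 0.0) or both_on(x1, x2, a)
                      or both_on(y1, y2, 0.0) or both_on(y1, y2, b))
        if not prescribed:
            free_index.append(k)
    return free_index


# Single rectangle split into two triangles: only the diagonal is free.
a, b = 22.86e-3, 10.16e-3
vertices = [(0.0, 0.0), (a, 0.0), (0.0, b), (a, b)]
edge_nodes = [(1, 2), (1, 4), (2, 4), (1, 3), (3, 4)]
free_index = number_free_edges(vertices, edge_nodes, a, b)
```

In this two-triangle example the only free edge is the diagonal, global edge 2, so the problem has a single degree of freedom.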


Solving the eigenvalue problem 


From a mathematical viewpoint, the most complex part of the finite element anal- 
ysis (and certainly the most computationally expensive) is actually the solution of 
the generalized eigenvalue problem represented by Eq. (9.49), repeated here: 


\[ [S]\{e_t\} = k_c^2 [T]\{e_t\} \]


Fortunately, modern scientific programming environments such as MATLAB make 
this very simple; for instance, in MATLAB, the function eig solves this with one 
command! (Similar routines are available in LAPACK, if using FORTRAN 90 or C, 
although calling them requires a little more work.) What emerges from the analysis 
is a set of eigenvalues, each with its associated eigenvectors. 

As should be anticipated from our earlier discussion, this vector element FEA 
includes static modes. (This very important point is often not mentioned explicitly, 
and causes novices no end of problems.) Interestingly, it is possible to predict the 
number of such modes. The idea is the following. For the Whitney element, the 
curl of the field is represented by a constant. For the null-space of the eigenvalue 
problem, where the field can be represented by a potential, this potential function
must thus be linear. The obvious approximation of a linear potential using nodal 
elements would require one degree of freedom per unconstrained (free) node. One 
of the solutions is actually the trivial solution E=0 (corresponding to a constant 
potential) and must be discounted (since it is also a valid, albeit trivial, solution of 
the dynamic problem) and thus the dimension of the null-space, K, is the number 
of unconstrained nodes minus one for Whitney elements. Hence K can be very 
large. In the 2D case, the ratio of edges to nodes tends to around three, so almost 
one-third of computed eigenvalues are actually null-space ones. 

In practice, the trivial solution is also irrelevant. So, once the eigenvalue problem 
has been solved, we must first sort the computed eigenvalues into ascending order, 
then count the number of free nodes, i.e. K + 1, and then finally, eigenvalue K + 2 
is the first eigenvalue of interest. (Again, this type of operation is very easily im- 
plemented in MATLAB, using the sort function.) 
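
The sorting and skipping step can be sketched as follows, here in Python rather than MATLAB. The function name `physical_cutoff_wavenumbers` is hypothetical, the eigenvalues are assumed to have been computed already (and to be real), and the sample values are made up purely to exercise the function.

```python
import math


def physical_cutoff_wavenumbers(eigenvalues, num_free_nodes, count):
    """Skip the null-space eigenvalues and return the first few wavenumbers.

    eigenvalues:    computed eigenvalues k_c^2 of the generalized problem
    num_free_nodes: number of unconstrained nodes, i.e. K + 1 in the text
    count:          how many eigenvalues of interest to return
    """
    ordered = sorted(eigenvalues)
    # Eigenvalue K + 2 (1-based) is the first one of interest, which is
    # index num_free_nodes when counting from 0.
    selected = ordered[num_free_nodes:num_free_nodes + count]
    return [math.sqrt(kc2) for kc2 in selected]


# Made-up data: three near-zero values followed by two physical eigenvalues.
sample = [137.43 ** 2, 1e-13, -2e-14, 274.86 ** 2, 3e-15]
kc = physical_cutoff_wavenumbers(sample, num_free_nodes=3, count=2)
```

Taking the square root only after the null-space values have been discarded avoids having to handle the slightly negative numbers that typically appear among them.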


Post-processing 


Once the finite element analysis is complete, the vector degrees of freedom need 
to be post processed to yield meaningful field data. As has been commented previ- 
ously, unlike interpolatory nodal-based elements, where a degree of freedom typi- 
cally represents a field component at a particular node, hierarchal vector elements 
only reconstruct a physically meaningful field when summed together. In this case, 
the eigenvector corresponding to a particular eigenvalue does not in itself directly 
represent a field. Given the degrees of freedom and the corresponding basis func- 
tions, the field E (x, y) can be computed at any point within the element. 

For this, one needs to compute directly the sum of the Whitney elements within 
each element, that is: 


\[ \mathbf{E}^e(x, y) = E^e_{12} \mathbf{w}_{12} + E^e_{13} \mathbf{w}_{13} + E^e_{23} \mathbf{w}_{23} \tag{9.84} \]


with $E^e_{ij}$ the degrees of freedom and $\mathbf{w}_{ij}$ the basis functions. (Here, it is worthwhile
pointing out that some authors include the appropriate edge lengths in the basis
function, e.g. $\mathbf{w}_{ij} = \ell_{ij} (\lambda_i \nabla\lambda_j - \lambda_j \nabla\lambda_i)$. The reason this is sometimes done is
that the degree of freedom is then the tangential field at each edge. In this case, the
[S] and [T] matrix entries are scaled appropriately [3], and the basis functions in
Eq. (9.84) must of course also include the edge length. This is obvious, but easy to 
overlook, since the lengths are often implied but not consistently retained in some 
of the literature.) 

All the theory needed for this has already been presented. The simplex coor- 
dinates for point (x, y) are computed from its basic definition as in Eq. (9.40), 
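expanded here for all three coordinates:

\[ \lambda_1 = \frac{1}{2A'} \begin{vmatrix} 1 & x & y \\ 1 & x_2 & y_2 \\ 1 & x_3 & y_3 \end{vmatrix}, \quad
\lambda_2 = \frac{1}{2A'} \begin{vmatrix} 1 & x_1 & y_1 \\ 1 & x & y \\ 1 & x_3 & y_3 \end{vmatrix}, \quad
\lambda_3 = \frac{1}{2A'} \begin{vmatrix} 1 & x_1 & y_1 \\ 1 & x_2 & y_2 \\ 1 & x & y \end{vmatrix} \tag{9.85} \]

with $2A'$ the signed determinant of Eq. (9.77).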


The gradients are computed as in Section 9.7.2, using specifically Eq. (9.71). 
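
The post-processing step for one element can be sketched as follows in Python. The function name `whitney_field` is hypothetical, local nodes are numbered from 0, the degrees of freedom are assumed to be ordered as $e_{12}$, $e_{13}$, $e_{23}$, and the unscaled Whitney functions (without edge lengths) are used.

```python
# Local edges as (i, j), 0-based, in the assumed order e12, e13, e23.
LOCAL_EDGES = [(0, 1), (0, 2), (1, 2)]


def whitney_field(xy, dofs, x, y):
    """Evaluate the field of Eq. (9.84) at the point (x, y) in one element.

    xy:   the three (x, y) vertex coordinates of the triangle
    dofs: the three edge degrees of freedom, ordered as LOCAL_EDGES
    Returns the Cartesian components (Ex, Ey).
    """
    (x1, y1), (x2, y2), (x3, y3) = xy
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)   # 2A', signed
    # Simplex coordinates as ratios of determinants, Eq. (9.85).
    lam = [
        ((x2 - x) * (y3 - y) - (x3 - x) * (y2 - y)) / det,
        ((x - x1) * (y3 - y1) - (x3 - x1) * (y - y1)) / det,
        ((x2 - x1) * (y - y1) - (x - x1) * (y2 - y1)) / det,
    ]
    # Gradients (b_i, c_i) of the simplex coordinates, Eq. (9.71).
    grad = [
        ((y2 - y3) / det, (x3 - x2) / det),
        ((y3 - y1) / det, (x1 - x3) / det),
        ((y1 - y2) / det, (x2 - x1) / det),
    ]
    ex = ey = 0.0
    for dof, (i, j) in zip(dofs, LOCAL_EDGES):
        ex += dof * (lam[i] * grad[j][0] - lam[j] * grad[i][0])
        ey += dof * (lam[i] * grad[j][1] - lam[j] * grad[i][1])
    return ex, ey


# Field of the first basis function alone on a right-angled triangle.
field = whitney_field([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
                      [1.0, 0.0, 0.0], 0.25, 0.25)
```

On this triangle the first basis function is $(1 - y)\hat{\mathbf{x}} + x\hat{\mathbf{y}}$, so the expected value at (0.25, 0.25) is (0.75, 0.25). Evaluating the field on a regular grid of such points, element by element, yields the data needed for a vector plot.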


9.7.4 Results 


The eigenvalues can be put into one-to-one correspondence with the analytically 
known eigenvalues. For a standard X-band guide, with a = 22.86 mm and b = 
10.16 mm, the first eight TE eigenmodes are listed in Table 9.1. 
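
The analytical values are easily generated from the standard rectangular waveguide result $k_c = \sqrt{(m\pi/a)^2 + (n\pi/b)^2}$. The Python sketch below does this; the function name `te_modes` and the search limit `max_index` are our own choices, and the exact defined value of the speed of light is used, so the last digit or two of the frequencies may differ slightly from tabulated values computed with a rounded constant.

```python
import math

C0 = 299792458.0    # speed of light in vacuum (m/s)


def te_modes(a, b, count, max_index=10):
    """Return the first `count` TE modes of an a x b rectangular waveguide.

    Each entry is a tuple (k_c in rad/m, f_c in Hz, m, n), sorted by k_c.
    """
    modes = []
    for m in range(max_index + 1):
        for n in range(max_index + 1):
            if m == 0 and n == 0:
                continue                    # TE00 does not exist
            kc = math.hypot(m * math.pi / a, n * math.pi / b)
            modes.append((kc, C0 * kc / (2.0 * math.pi), m, n))
    modes.sort()
    return modes[:count]


# Standard X-band guide.
modes = te_modes(22.86e-3, 10.16e-3, 8)
for kc, fc, m, n in modes:
    print(f"TE{m}{n}  {kc:8.2f} rad/m  {fc / 1e9:8.4f} GHz")
```

The mode ordering and the cut-off wavenumbers obtained in this way can then be compared entry by entry with the finite element eigenvalues.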

The relative error of the eigenvalues computed with the FEM compared to the 
analytical results is shown in Fig. 9.6. Clearly, refining the mesh has the desired re- 
sult of decreasing the error. Individually, the eigenmodes display different conver- 
gence with, for instance, the seventh eigenmode (TE31) being accurately computed 


Table 9.1 First eight transverse electric modes
in a standard X-band waveguide, giving cut-off
wavenumber and frequency


Mode    $k_c$ (rad/m)    $f_c$ (GHz)
TE10    137.43           6.5573
TE20    274.86           13.1146
TE01    309.21           14.7539
TE11    338.38           16.1455
TE30    412.28           19.6719
TE21    413.71           19.7401
TE31    515.35           24.5899
TE40    549.71           26.2292
Figure 9.6 The relative error in the first eight eigenmodes.


by even the very coarse 16 element mesh. This behavior has been observed in many 
implementations, and what is usually studied is an average error. In Fig. 9.7, the 
result for the RMS error of the first eight eigenmodes is plotted versus average 
triangle length h. Theoretically, the Whitney element is complete to zeroth order, 
so the error term should be of O(h). Since the functional depends on the square of 
the field, and is stationary at the true solution, the resulting error is $O(h^2)$. We can
confirm this on the log-log plot; this is (approximately) a straight line, with slope 
2.14 (this can be conveniently obtained using the MATLAB function polyfit). 
Hence the error E is: 


\[ E = K h^2 \tag{9.86} \]


where K is an unknown coefficient. This is a well-known result in finite element 
analysis [2, p. 148] (note that the exponent has the incorrect sign in this reference).
It is also confirmed by the interpolation error bound of $c h^k$, with c a constant
and k = 1 in this case, originally given by Nedelec [8, Eq. 22] (although
this is not exactly the same as the overall error, which is what we are evaluating) 
if one recalls that the eigenvalue, as a stationary property, is the square of this 
estimate. (Morishita and Kumagai showed that with the curl-curl functional, the 
eigenvalue is stationary [29, Section IV]; this is also discussed by Chen and Lien 
[30].) 
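
The slope of the log-log plot is simply a first-order least-squares fit to $\log E$ versus $\log h$. A sketch of the calculation in Python is given below; the function name `loglog_slope` is hypothetical, and the data are synthetic, generated to follow $E = K h^2$ exactly, and are not the results plotted in Fig. 9.7.

```python
import math


def loglog_slope(h, err):
    """Least-squares fit of log(err) = slope * log(h) + log(K).

    Returns (slope, K). This is the same calculation as a first-order
    polynomial fit applied to the logarithms of the data.
    """
    xs = [math.log(v) for v in h]
    ys = [math.log(v) for v in err]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, math.exp(intercept)


# Synthetic data obeying E = K h^2 with K = 5 (illustrative values only).
h = [4e-3, 2e-3, 1e-3, 0.5e-3]
err = [5.0 * v ** 2 for v in h]
slope, K = loglog_slope(h, err)
```

For the synthetic data the fit recovers a slope of 2 and K = 5 to within rounding. With real results the fitted slope will deviate somewhat from the theoretical value, as the figure of 2.14 quoted above shows.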


Figure 9.7 RMS error in first eight eigenmodes versus average mesh size h. 


The eigenmodes are conveniently compared visually. Figures 9.8 and 9.9 show 
the first six eigenmodes, computed analytically and with a 256 element FEM solution
respectively. These results were plotted with the MATLAB quiver function.¹²
Note that the sign of the eigenmode is essentially arbitrary; for instance,
the TE01 eigenmode has been computed with opposite sign by the analytical and
finite element methods. Also, for interest, the first six “spurious” eigenmodes are 
shown in Fig. 9.10. The wavenumbers appear to be complex; this is simply due 
to taking the square root of numbers approximating zero, but slightly negative. 
There are 105 such eigenvalues and associated eigenmodes, in a problem with 
360 degrees of freedom. One notes that, in general, these modes satisfy the bound- 
ary condition of zero tangential field, but cannot of course be recognized as tradi- 
tional TE modes. 
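
The counts quoted above can be cross-checked by simple arithmetic. The Python sketch below assumes (this is our assumption; the layout of the mesh is not stated here) that the 256 element mesh consists of a 16 by 8 grid of rectangles, each split into two triangles, with all boundary edges and boundary nodes prescribed. The function name `mesh_counts` is hypothetical.

```python
def mesh_counts(nx, ny):
    """Count entities in an nx x ny grid of rectangles split into triangles."""
    elements = 2 * nx * ny
    nodes = (nx + 1) * (ny + 1)
    # Horizontal edges + vertical edges + one diagonal per rectangle.
    edges = nx * (ny + 1) + ny * (nx + 1) + nx * ny
    boundary_edges = 2 * (nx + ny)
    return {
        "elements": elements,
        "nodes": nodes,
        "edges": edges,
        "free_edges": edges - boundary_edges,   # degrees of freedom
        "free_nodes": (nx - 1) * (ny - 1),      # interior nodes, K + 1
    }


counts = mesh_counts(16, 8)
```

Under this assumption the mesh has 360 free edges and 105 free nodes, matching the number of degrees of freedom and the number of near-zero eigenvalues reported above, and consistent with the earlier estimate that almost one-third of the computed eigenvalues belong to the null-space.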


9.8 The three-dimensional Whitney element 


The FEM using vector elements in 3D is in a sense just a straightforward exten- 
sion of the 2D analysis; however, the mesh generation and book-keeping problems 
become formidable and we will not discuss the actual implementation of such a 
code in detail; developing a truly general-purpose 3D FEM code is a challenging 


12 Readers should note that this is a rather tricky function to use. 
Figure 9.9 Quiver plot of the first six eigenmodes, computed with a 256 element FEM solution.

Figure 9.10 Quiver plot of the first six “spurious” eigenmodes, computed with a 256 element FEM solution.


task, although developing a special purpose 3D finite element analysis code is not 
an entirely unreasonable undertaking. References that can assist in this regard may 
be found in Section 9.9. 

The three-dimensional Whitney element is exactly the same as in two dimen- 
sions 


\[ \mathbf{w}_{ij} = \lambda_i \nabla\lambda_j - \lambda_j \nabla\lambda_i \tag{9.87} \]


with the obvious difference that there are now six degrees of freedom per tetra- 
hedron, rather than three per triangle, since a tetrahedron has six edges. This ele- 
ment has exactly the same well-known properties of constant tangential/linear nor- 
mal field (CT/LN) approximation along edges (hence, of mixed order) as its two- 
dimensional counterpart and needs no further discussion. Once again, conventions 
should be adopted right from the start; since we are going to address higher-order 
elements later, which have degrees of freedom linked to faces as well as edges, we 
need also to number faces. See Table 9.2 for one such convention. (The face num- 
bering conventions in the literature are generally not consistent. This one follows 
[3, Table II], but differs from [17], for example.) 

All the comments made in the context of two-dimensional elements are equally 
germane here; however, the coding effort is at least an order of magnitude more, 
Table 9.2 Local edge and face numbering convention for
3D tetrahedrons


Local edge numbering      Local face numbering
Edge   Local nodes        Face   Local nodes
1      1  2               1      1  2  3
2      1  3               2      1  2  4
3      1  4               3      1  3  4
4      2  3               4      2  3  4
5      2  4
6      3  4


due to the complexity of three-dimensional tetrahedral meshes and the much larger 
problem size required by realistic problems, and hence we conclude our introduc- 
tory coverage at this point. 
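
The convention of Table 9.2 is simply the lexicographic ordering of node pairs and node triples, which makes it easy to generate rather than type in. The short Python sketch below illustrates this; the names `LOCAL_EDGES_3D`, `LOCAL_FACES_3D` and `tetrahedron_edges` are our own.

```python
from itertools import combinations

LOCAL_NODES = (1, 2, 3, 4)

# Lexicographic pairs and triples reproduce the columns of Table 9.2.
LOCAL_EDGES_3D = list(combinations(LOCAL_NODES, 2))
LOCAL_FACES_3D = list(combinations(LOCAL_NODES, 3))


def tetrahedron_edges(global_nodes):
    """Return the six edges of a tetrahedron as (low, high) global node pairs.

    The nodes are first sorted into ascending global order, so that every
    edge is directed from the lower to the higher node number.
    """
    n = sorted(global_nodes)
    return [(n[i - 1], n[j - 1]) for (i, j) in LOCAL_EDGES_3D]


edges = tetrahedron_edges([17, 4, 9, 12])
```

Sorting the global nodes first carries the two-dimensional sign convention over unchanged, so an edge shared by several tetrahedra has the same sense in each of them.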


9.9 Further reading 


This chapter has focussed heavily on vector finite elements; the explanations of 
the properties of the elements reflect what might be called the current orthodoxy. 
It should be mentioned that there has been criticism of these elements from some 
quarters, most stridently from Mur [31]. One should note that his criticism is heav- 
ily influenced by his work on magnetostatic problems, where the permeability can 
vary enormously from element to element and vector elements may indeed ex- 
hibit serious problems due to this. Recall also our earlier discussion about material 
interfaces and field continuity, and the problems with node-based elements, in Sec- 
tion 9.5; de Lager and Mur were able to introduce a node-based element which can 
indeed handle material discontinuities [32]. However, at the time of writing, this 
element had not been applied to 3D high-frequency analysis, and it seems likely 
that the current vector elements will continue to dominate finite element analysis 
in the foreseeable future. On the topic of spurious modes, work by Vardapetyan and
Demkowicz has addressed the problem at a quite fundamental level, introducing 
Lagrange multipliers in the functional; [33] is representative of their work. 

More generally, the reader is fortunate that there are a number of excellent and 
current texts on the FEM available. Silvester and Ferrari’s book [2] (first published 
in 1983, approximately doubling in length with the 1990 second edition, and in- 
creasing again in length significantly with the 1996 third edition) was for years 
the only reference in the field, and the current edition contains good coverage of 
high-frequency topics, in addition to extensive coverage of statics and magneto- 
statics. (Incidentally, the second edition contains some useful material which was 
not in the third, and is also worth acquiring if the opportunity presents itself.)
Jin’s text has recently been revised [4] and is probably the book of first choice for 
high-frequency electromagnetics, which it concentrates on exclusively. Volakis, 
Chatterjee and Kempel’s text also focusses on high-frequency applications, and 
contains much useful information on various elements [19]. Pelosi, Coccioli and 
Selleri’s book lives up to its name, and is a good starter text [34]. Peterson, Ray and 
Mittra’s book is somewhat more general in scope than just the FEM, but provides 
particularly deep coverage of coupled FEM/MoM formulations [35]. The text by 
Salazar-Palma et al. [7] is more of a research monograph; it concentrates primar- 
ily on interpolatory elements. The coverage is more theoretical than the other texts 
discussed here, and it is especially useful as preparatory reading if one intends 
working through mathematical papers such as Nedelec’s. 

Two other very useful sources are the 1996 anthology edited by Silvester and 
Pelosi [1]; the extensive annotations are especially useful for putting the work in 
perspective, and the anthology contains a number of earlier papers which are oth- 
erwise hard to come by. Some important papers have appeared since the anthology 
was published (and have been referenced in this chapter) but these are generally 
easily accessible. The collection edited by Itoh, Pelosi and Silvester [36] (also in 
1996) contains a number of important contributions; in the context of vector el- 
ements, [6] deserves particular mention. For readers who would like to embark 
on their own three-dimensional implementation, there are two papers which will 
be of considerable interest, since they provide an eminently practical viewpoint on 
finite element coding. The first is by the present author [37]; the second reflects ex- 
perience by Kempel’s group [38], and was written specifically to complement the 
former. In [37], a number of practical issues are discussed, but mesh generation and 
linear algebra are only very briefly considered. In [38], an excellent overview of 
the many meshing packages available is provided, as well as a discussion of some 
sparse matrix solution routines. Sparse matrix schemes are on the one hand essen- 
tially an entirely practical problem, but on the other their efficient use is essential 
for commercial codes — we will briefly discuss this in the next chapter. The book 
by Duff et al. [39] is the standard reference on this. It has to be commented that 
specifically the topic of sparse matrices is not well treated in the CEM literature 
on finite elements. 


9.10 Conclusions 


This chapter has introduced the finite element method for high-frequency elec- 
tromagnetic field solutions. Using primarily the variational formulation, the ba- 
sic method was introduced for the scalar Laplace equation, following which we 
immediately addressed vector (edge) elements for the vector wave equation. An 
eigenvalue problem was solved, and used to illustrate ideas both about the theory
of finite element solutions of the vector wave equation, as well as a plethora of
practical issues which one must address when writing an actual finite element code. 
Two-dimensional finite element codes require only moderate coding complexity, 
and it is quite realistic to attempt development of such a code oneself. The exten- 
sion to three dimensions has been discussed. Without wishing to dissuade readers 
from attempting a full higher-order 3D finite element implementation using tetra- 
hedral elements, we should caution that getting all the aspects required working 
together efficiently, and reliably, is no small undertaking. One way of “easing” into 
this would be to start with a “brick” mesh; the bricks can be subdivided into tetra- 
hedra. (One way to do this was shown in [40, Fig. 9.5], unfortunately not repeated 
in the third edition.) Of course, this does not truly exploit the power of the FEM 
for modelling complex geometries. Another approach is to use prismatic meshes; 
Volakis, Kempel and their colleagues have been very successful with such meshes 
for a variety of antenna problems [19, 38]. One might term this 2.5D-modelling,
although of course the full 3D field solution is obtained. 

In the following chapter, a variety of more advanced topics on the FEM are 
introduced. Starting with the extension of vector elements to higher orders, the 
application of these will be illustrated by way of a deterministic problem (an ob- 
stacle in a rectangular waveguide, analyzed using both commercial and research 
codes). The FEM/MoM hybrid formulation will be introduced, and some results 
shown. Then a time domain formulation of the finite element method for the vector 
wave equation is outlined. The issue of sparse matrix storage schemes and solu- 
tion methods is considered, before finishing the coverage with an introduction to 
the field of error estimation and mesh adaptation. 
A selection of more advanced topics on the finite element method


In this final chapter, we discuss a selection of more advanced topics, primarily 
relating to the finite element method. However, as will be seen, a linkage to the 
method of moments will be established, and perhaps rather less expectedly, the 
finite difference time domain method will also emerge as a special case of a finite 
element time domain treatment, so amongst other purposes, the chapter serves to 
draw together these three apparently quite different methods. 

We will start by considering a very important extension of the vector ele- 
ments, namely higher-order elements. Following this, the stationary functional 
formulation for deterministic (driven) problems will be outlined. In the pre- 
ceding chapter, an eigenvalue problem was used to illustrate the FEM in two 
dimensions; in this chapter, a deterministic three-dimensional problem will be 
discussed, namely the analysis of waveguide obstacles. Finite element analysis is 
ideal for this problem, and good results have been obtained by a number of work- 
ers. Results for two waveguide problems computed using FEM codes incorporat- 
ing higher-order elements will be shown. Then, a hybrid FEM/MoM formulation, 
which has proven very powerful for specialized applications, will be introduced, 
and an application to radiation exposure assessment near a base-station antenna 
will be presented. Following this, time domain finite element analysis is briefly 
discussed. 

We conclude the chapter with a discussion on two issues which impact on ef- 
ficiency. Firstly, sparse matrix storage schemes are briefly outlined, and secondly, 
error estimation and the use of mesh adaptation based on this is discussed. 

The coverage in this chapter is at a higher level than in much of the rest of 
this book. Generally, the topics discussed are too complex to permit a simple im- 
plementation, and the intention of this chapter is rather to sensitize the reader to 
current topics of interest in the field. Nonetheless, with the exception of time do- 
main FEM, aspects of all the topics discussed are either already incorporated in 
commercial codes, or can be expected to be available shortly. 
10.1 Higher-order elements


Although extending “edge” elements to higher orders became a topic of interest as 
soon as the CT/LN elements achieved widespread acceptance, it remains a topic 
of active research at present, a decade or more later. Development of such ele- 
ments raises a number of issues, including: hierarchal versus interpolatory behav- 
ior; methods for the construction of the element shape functions; the interpretation 
of the degrees of freedom; the construction of prototype elemental matrices (an- 
alytical versus quadrature); and the efficient iterative solution of the poorly con- 
ditioned linear algebra systems which unfortunately often result. Various names 
are in use: the two-part field description as used in the preceding chapter (e.g. 
linear tangential/quadratic normal, LT/QN) is particularly insightful and is used 
here. However, before introducing higher-order elements, it is worthwhile briefly 
discussing the question of completeness and vector elements. 


10.1.1 Complete versus mixed-order elements 


A family of polynomials is complete to order N if a linear combination of its 
members can exactly express any polynomial of degree not exceeding N, but no 
higher [1, p. 272]. For a complete first-order approximation of a function in x and 
y, three terms are needed; one constant and two terms linear in x and y respec- 
tively. Clearly, for a first-order complete expansion of a two-dimensional vector 
field, each component will require three terms, hence six degrees of freedom will 
be required. For a tetrahedral element, approximating a three-dimensional field, 
twelve are needed (there is an additional linear term in z for each component, and 
of course, three components). By comparison, the Whitney triangular element has 
three degrees of freedom, and the tetrahedral element six; as we have seen in Sec- 
tion 9.6, this results in certain field components being approximated by a constant, 
and clearly these elements are of mixed order. 
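
The counting argument generalizes readily: a complete polynomial of degree N in d variables has $\binom{N+d}{d}$ terms, and a vector field has d components, while a Whitney element has one degree of freedom per edge of the simplex. The Python sketch below evaluates both; the function names are our own, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def complete_vector_dofs(order, dim):
    """Degrees of freedom for a vector field complete to `order` in `dim` dimensions."""
    return dim * comb(order + dim, dim)


def whitney_dofs(dim):
    """Degrees of freedom of the Whitney element: one per edge of the simplex."""
    return comb(dim + 1, 2)


first_order_2d = complete_vector_dofs(1, 2)    # triangle, complete first order
first_order_3d = complete_vector_dofs(1, 3)    # tetrahedron, complete first order
second_order_3d = complete_vector_dofs(2, 3)   # tetrahedron, complete second order
```

This reproduces the six and twelve degrees of freedom quoted above, against three and six for the Whitney triangle and tetrahedron. The second-order value of 30 for a tetrahedron agrees with the total of 6 + 6 + 8 + 10 degrees of freedom listed in Table 10.1.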

So many of the early papers on Whitney elements emphasized the mixed-order 
nature of the element that it is not always appreciated that being of mixed order 
is not an essential property of vector elements per se. Complete sets of vector 
elements have also been described [2], with degrees of freedom proportional to 
tangential field components, as for mixed-order elements. This permits enforce- 
ment of only tangential field continuity, as for mixed-order elements, with normal 
(dis)continuity following as a natural boundary condition, as discussed in Chap- 
ter 9. (It is easy to produce complete scalar or nodal elements, but then of course 
we are back with the inconvenient problem of having degrees of freedom repre- 
senting a Cartesian component at a point, rather than a tangential field component, 
which was one of the major motivations for the development of vector elements, 
as we saw in Chapter 9.) For wave eigenvalue problems, such complete sets
Table 10.1 Webb’s hierarchal elements (to second order complete) [3]


CT/LN (6 × 1 = 6 edge-based degrees of freedom)
  Edge-based   1 per edge   $\zeta_i \nabla\zeta_j - \zeta_j \nabla\zeta_i$

Additional LT/LN functions (6 × 1 = 6 extra edge-based degrees of freedom)
  Edge-based   1 per edge   $\nabla(\zeta_i \zeta_j)$

Additional LT/QN functions (4 × 2 = 8 extra face-based degrees of freedom)
  Face-based   2 per face   $\zeta_j \zeta_k \nabla\zeta_i + \zeta_i \zeta_k \nabla\zeta_j - 2 \zeta_i \zeta_j \nabla\zeta_k$
                            for {i; j; k} = {1; 2; 3} and {2; 3; 1}

Additional QT/QN functions (6 × 1 edge-based + 4 × 1 face-based = 10 extra
degrees of freedom)
  Edge-based   1 per edge   $\nabla(\zeta_i \zeta_j [\zeta_i - \zeta_j])$
  Face-based   1 per face   $\nabla(\zeta_i \zeta_j \zeta_k)$


After [4], ©2003 IEEE, reprinted with permission.


of vector elements produce “wasted” degrees of freedom, as we have already dis- 
cussed. In essence, Nedelec’s constraints provide mixed-order elements that model 
the curl-space as efficiently as possible, for a given number of degrees of freedom. 
However, not all problems, in particular deterministic ones, share these character- 
istics. Recent work by Webb [3] and the present author [4] has indicated that some 
vector electromagnetic problems are more efficiently analyzed using complete- 
order vector elements, typically when the solution is dominated by electric fields 
strongly “gradient” in nature. 


10.1.2 Hierarchal vector basis functions 


There are presently two competing approaches to higher-order vector elements. 
One approach is interpolatory; in this case, a degree of freedom is typically asso- 
ciated with a tangential field at a specific point. The other approach is hierarchal, 
in which case a specific higher-order set contains all the lower-order basis
functions.¹ For mesh refinement/enrichment purposes, hierarchal elements are very
useful, and here we consider only the use of such elements, in particular those 
presented in [3]. (For a comprehensive discussion of interpolatory elements, see 
[5]. These elements can be used for h-adaptation, but are inconvenient at least 
for p-adaptation. We will discuss these topics in Section 10.9.) These elemental 
basis functions are summarized in Table 10.1, along with the number of degrees 
of freedom per tetrahedron and their respective associations with edges or faces. 
Webb presented the information slightly differently in his paper [3, Tables III,
IV and V]; here, the additional gradient-space functions required for the LT/LN
and QT/QN elements have been explicitly written as gradients of products of
simplex coordinates to highlight this functional dependence. Note that only the
additional basis functions required are tabulated, to avoid repetition; i.e. the
full second-order QT/QN set of basis functions will include all thirty listed.


1 Nodal elements can be both interpolatory and hierarchal; there does not appear to be a proof prohibiting a set
of vector elements from having both properties. However, no such vector elements have yet been proposed.


Table 10.2 Comparison of various hierarchal element schemes (to LT/QN)

CT/LN, all
    Edge-based    1 per edge    $\zeta_i \nabla\zeta_j - \zeta_j \nabla\zeta_i$

LT/QN, Savage [10]
    Edge-based    1 per edge    $\nabla(\zeta_i \zeta_j)$
    Face-based    2 per face    $\zeta_i (\zeta_j \nabla\zeta_k - \zeta_k \nabla\zeta_j)$
                                (and {j; i; k} but not {k; i; j})

LT/QN, Webb and Forghani [7]
    Edge-based    1 per edge    $\nabla(\zeta_i \zeta_j)$
    Face-based    2 per face    $\zeta_i \zeta_k \nabla\zeta_j$
                                (and {j; k; i} but not {i; j; k})

LT/QN, Andersen and Volakis [11]
    Edge-based    1 per edge    $(\zeta_i - \zeta_j)(\zeta_i \nabla\zeta_j - \zeta_j \nabla\zeta_i)$
    Face-based    2 per face    $\zeta_i (\zeta_j \nabla\zeta_k - \zeta_k \nabla\zeta_j)$
                                (as for Savage’s elements)

LT/QN, Webb [3]
    Edge-based    1 per edge    $\nabla(\zeta_i \zeta_j)$
    Face-based    2 per face    $\zeta_j \zeta_k \nabla\zeta_i + \zeta_i \zeta_k \nabla\zeta_j - 2 \zeta_i \zeta_j \nabla\zeta_k$
                                for {i; j; k} = {1; 2; 3} and {2; 3; 1}

After [4], ©2003 IEEE, reprinted with permission.

Webb’s approach is elegant in that one progressively enriches the curl space,
and then the gradient space.² (Earlier proposals did not follow this approach.)
For example, moving from CT/LN to LT/LN, one adds elements of the form
$\nabla(\zeta_i \zeta_j)$, one per edge, which is clearly in the gradient space.
(The curl of this is identically zero.) This then gives a complete first-order
approximation function. Moving from LT/LN to LT/QN, an additional eight
face-based degrees of freedom are added, giving twenty vector-based functions
and degrees of freedom per tetrahedron. “Face-based” means that the degree of
freedom is associated with the integral of the tangential field over the face.
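
The running totals quoted above (six, twelve, twenty and thirty degrees of freedom per tetrahedron) follow directly from the per-edge and per-face counts in Table 10.1. A minimal sketch of this bookkeeping is given below; the names `ENRICHMENTS` and `cumulative_dofs` are hypothetical and serve only to illustrate the counting.

```python
# Degrees of freedom per tetrahedron for Webb's hierarchal families (Table 10.1).
# All names here are illustrative; they are not taken from any FEM package.
N_EDGES, N_FACES = 6, 4  # a tetrahedron has 6 edges and 4 faces

# (family, extra functions per edge, extra functions per face)
ENRICHMENTS = [
    ("CT/LN", 1, 0),
    ("LT/LN", 1, 0),
    ("LT/QN", 0, 2),
    ("QT/QN", 1, 1),
]


def cumulative_dofs():
    """Return {family: total dofs per tetrahedron}, accumulating hierarchally."""
    totals, running = {}, 0
    for name, per_edge, per_face in ENRICHMENTS:
        running += N_EDGES * per_edge + N_FACES * per_face
        totals[name] = running
    return totals


dof_totals = cumulative_dofs()
for family, total in dof_totals.items():
    print(f"{family}: {total} degrees of freedom per tetrahedron")
```

Because the scheme is hierarchal, each family simply adds its functions to those of the previous one, which is why a running sum suffices.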

Many other hierarchal elements have been published, in particular of LT/QN
order. Some of these are summarized in Table 10.2, which should serve as a
useful summary of some of the various elements in current use. Another recent
contribution on hierarchal elements is the work of Sun and Lee [6]; they use a
slightly different approach to construct the elements, but the resulting basis
functions are very similar, although not identical, to those of [3]. Most of
these (including those of Savage described above) can be seen as variants of
the elements originally proposed by Webb and Forghani [7]. (Indeed, not only
are these variants on a theme, they are also linear transforms of one another.)
Note that all the face elements exclude (arbitrarily) one possible combination
of {i; j; k}; this asymmetry has long been noted, and is required to avoid
linearly dependent basis functions.


2 The Whitney element is actually a special case; it includes elements of both.

These elements are generally constructed by “inspection,” using the properties 
of simplex coordinates, and the gradients thereof. Webb’s recent work is the most 
comprehensive and theoretically motivated development along these lines to ap- 
pear in the electrical engineering literature. It is worth investigating the properties 
of these elements a little further, since some of these are far from trivially obvi- 
ous. For instance, it is not immediately apparent why the higher-order hierarchal 
elements have degrees of freedom associated with edges, faces or in some cases, 
with neither of these (the “volume-centered” degrees of freedom). 


10.1.3 Properties of hierarchal basis functions 


For this, it is useful to return to some basic properties of these elements, as orig- 
inally laid down by Nedelec [8]. (It should be commented that not all vector el- 
ements which have been proposed satisfy his criteria, but those presently under 
discussion do.) Nedelec focussed on degrees of freedom, rather than basis func- 
tions; indeed, his original work simply states the necessary properties, rather than 
proposing actual basis functions. The degrees of freedom as he defined them are 
not unique,³ even for the lowest order (Whitney) element, although in practice the
non-uniqueness is only a matter of a constant for the lowest order case and does 
not impact on the code at all. However, as seen in the previous section, a variety of 
different basis functions have been proposed for higher-order elements. 

This is rather cryptically implied in Definition 4 of Nedelec’s original work [8]. 
For “kth” mixed-order elements, the 6k edge-based degrees of freedom for 3D 
elements (3k in 2D) should be given by 


$$\int_a \vec{u} \cdot \hat{t}\, q \, dC, \quad \forall q \in P_{k-1} \tag{10.1}$$


$\vec{u}$ is a basis function and $\hat{t}$ is the unit vector along edge $a$.
$P_k$ is the linear space of polynomials of degree $\le k$. For the Whitney
element, with $k = 1$, we see that $q$ may only be a constant. In the case of
this element, with $(\zeta_i \nabla\zeta_j - \zeta_j \nabla\zeta_i)$ form,
this constant is often implicitly unity, and the associated Nedelec degree of
freedom (which may be viewed as located at the middle of the relevant edge,
although this is not essential) is the tangential field on this edge. We
commented earlier in Section 9.6 that it may be shown that the integral of the
tangential component of the Whitney element along an edge is constant; we will
now do this.


3 The polynomial space described is unique, however.

This proof is rather simple. Integrating the Whitney element along an edge 
yields two integrals. The first is of the form 


$$\int \zeta_i \left( \nabla\zeta_j \cdot \hat{t} \right) dC \tag{10.2}$$


and the other has $i$ and $j$ interchanged, and is of opposite sign. Throughout
the element, $\nabla\zeta_j$ is constant, and it is perpendicular to the edge
opposite node $j$ (this was discussed in Section 9.6). Clearly,
$\nabla\zeta_j \cdot \hat{t}$ is thus also a constant along any particular
edge. Along the edge directed from nodes $i$ to $j$, what remains is an
integral of a simplex coordinate, varying linearly from 0 to 1, along the edge.
The result is $\pm\frac{1}{2}\ell$, with $\ell$ the edge length, and the sign
depending on the direction of integration. Clearly, this is a constant, as is
the other integral. Obviously, incorporating additional constants, such as
Nedelec’s $q$, changes only the final constant, which is irrelevant in
practice. The result is as in Appendix A,


$$\vec{E}_{\tan|\mathrm{edge}} = \frac{1}{\ell}\, \hat{t} \tag{10.3}$$


When $\vec{E}_{\tan|\mathrm{edge}_i}$ is integrated along edge $i$, the result
is the well-known identity that the appropriate degree of freedom is the
tangential field along edge $i$.

Importantly, on the other two sides, one or the other simplex coordinate will be
zero, and the gradient in the remaining term is entirely normal to the edge.
Thus Eq. (10.1) will yield zero for this term on the other two edges (due to
the $\vec{u} \cdot \hat{t}$ term). The argument is precisely the same for
tetrahedra.
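
The argument above is easily confirmed numerically. The sketch below (plain Python, with an arbitrarily chosen triangle; the function names are hypothetical) integrates the tangential component of the Whitney function $\zeta_i \nabla\zeta_j - \zeta_j \nabla\zeta_i$ along each edge of a triangle. With this normalization the integral should be unity on the edge from node $i$ to node $j$ and zero on the other two.

```python
# Numerical check of the Whitney edge integrals on one (arbitrary) triangle.
VERTS = [(0.0, 0.0), (2.0, 0.0), (0.5, 1.5)]


def simplex_gradients(verts):
    """Constant gradients of the three simplex coordinates of a triangle."""
    (x1, y1), (x2, y2), (x3, y3) = verts
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)  # twice the signed area
    return [((y2 - y3) / det, (x3 - x2) / det),
            ((y3 - y1) / det, (x1 - x3) / det),
            ((y1 - y2) / det, (x2 - x1) / det)]


def whitney_edge_integral(i, j, a, b, verts, n=50):
    """Integrate (zeta_i grad zeta_j - zeta_j grad zeta_i) . t along edge a -> b."""
    grads = simplex_gradients(verts)
    dx, dy = verts[b][0] - verts[a][0], verts[b][1] - verts[a][1]
    total = 0.0
    for m in range(n):                       # midpoint rule in s on [0, 1]
        s = (m + 0.5) / n
        zeta = [0.0, 0.0, 0.0]
        zeta[a], zeta[b] = 1.0 - s, s        # simplex coordinates on the edge
        wx = zeta[i] * grads[j][0] - zeta[j] * grads[i][0]
        wy = zeta[i] * grads[j][1] - zeta[j] * grads[i][1]
        total += (wx * dx + wy * dy) / n     # t dC = (dx, dy) ds
    return total


own_edge = whitney_edge_integral(0, 1, 0, 1, VERTS)
other_edges = [whitney_edge_integral(0, 1, 1, 2, VERTS),
               whitney_edge_integral(0, 1, 2, 0, VERTS)]
print(f"own edge: {own_edge:.6f}, other edges: {other_edges}")
```

The same test extends to a tetrahedron by adding the third coordinate; it is restricted to two dimensions here only for brevity.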

Additional edge-based degrees of freedom, as required for Webb’s scheme for 
LT/LN order elements and higher, of the form 


$$\nabla(\zeta_i \zeta_j) = \zeta_i \nabla\zeta_j + \zeta_j \nabla\zeta_i$$


yield exactly the same result — they contribute additional degrees of freedom on 
edge {i; j} and nothing to the other edges. (Note that a different choice of $q$ may
be required in this case, otherwise the degree of freedom is zero. A linear function 
is an obvious possibility, such as a suitable Legendre polynomial.) 
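
The remark about the choice of weighting polynomial can be made concrete with a short worked example. Parametrize the edge from node $i$ to node $j$ by $s \in [0, 1]$, so that $\zeta_i = 1 - s$, $\zeta_j = s$, $\nabla\zeta_j \cdot \hat{t} = 1/\ell$, $\nabla\zeta_i \cdot \hat{t} = -1/\ell$ and $dC = \ell\, ds$. (The parametrization and the particular linear weight used below are assumptions made for illustration.) Then

$$
\nabla(\zeta_i \zeta_j) \cdot \hat{t} = \frac{\zeta_i - \zeta_j}{\ell} = \frac{1 - 2s}{\ell},
\qquad
\int_0^1 (1 - 2s)\, ds = 0,
\qquad
\int_0^1 (1 - 2s)^2\, ds = \frac{1}{3}
$$

A constant weight thus gives a zero degree of freedom, whereas the linear weight $1 - 2s$ (the first-order Legendre polynomial mapped onto the edge) gives a non-zero one.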

Now, the face-based elements. Nedelec’s original definition of the $4k(k-1)$
face-based degrees of freedom for higher-order elements of maximum (but not
complete) order $k$ was

$$\iint_f \vec{u} \times \hat{n} \cdot \vec{q}\, dS, \quad \forall \vec{q} \in (P_{k-2})^2 \tag{10.4}$$


Here, $\hat{n}$ is the unit vector normal to face $f$, and the polynomial
$\vec{q}$ is now two dimensional (for $k = 2$, this must be a constant). Let us
now see why these additional degrees of freedom, which enrich the curl space
for the LT/QN element, are associated only with faces. We will consider vector
elements of the form $\zeta_i (\zeta_j \nabla\zeta_k - \zeta_k \nabla\zeta_j)$;
the Webb LT/QN enrichment in Table 10.1 is a linear combination of two such
forms, so the argument includes these. On face $\{i, j, k\}$, one of the
simplex coordinates will always be zero on each edge; for example, $\zeta_i$ is
zero on edge $\{j, k\}$, so these do not contribute to the edge-based degrees
of freedom. (This extends to faces, e.g. $\zeta_i$ is zero everywhere on face
$\{j, k, l\}$. Hence this basis function will have no tangential projection on
any other face.)
Over face {i, j, k}, the degrees of freedom are thus 


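$$\iint_f \zeta_i \zeta_j \left( \nabla\zeta_k \times \hat{n} \right) \cdot \vec{q}\, dS - \iint_f \zeta_i \zeta_k \left( \nabla\zeta_j \times \hat{n} \right) \cdot \vec{q}\, dS \tag{10.5}$$


The $(\nabla\zeta_k \times \hat{n})$ and $(\nabla\zeta_j \times \hat{n})$
terms are constant over this face, as is $\vec{q}$, and what remains are two
standard integrals in simplex form, proportional to the triangle area and thus
constant.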

The higher-order elements (quadratic tangential/cubic normal, QT/CuN, etc.) 
involve additional “volume-centered” degrees of freedom. These each involve 
products of all the simplex coordinates, so are clearly zero on all faces and edges. 
For these basis functions, the associated degree of freedom as defined by Nedelec 
is a weighted integral over the volume. 


10.1.4 Practical impact of higher-order basis functions in an FEM code 


The discussion in the preceding section may appear highly theoretical, so it is 
worthwhile summarizing the practical impact hereof. Finite element codes do not 
usually actually compute the degrees of freedom as defined by Eqs. (10.1) and 
(10.4), since this usually serves no particular purpose. The “degrees of freedom” 
for which an FEM code solves are usually simply the unknowns associated with 
each basis function; as we have seen, for the Webb elements (and most other prop- 
erly defined vector elements) these degrees of freedom can be correctly associated 
with edges, faces or the volume, and for the first two, the degrees of freedom are 
tangential field projections onto the edge or face, as required by Nedelec’s original 
work. To enforce field continuity correctly, a degree of freedom associated with an 
edge or face must simply be shared between all connected elements; we discussed
this in the context of edges in the preceding chapter. Note that edges have direc- 
tions; the numbering scheme used there ensured that the directions were consistent 
between elements, and one must do the same with faces. Volume-centered degrees 
of freedom have no projection on the edges or faces and hence are not shared by 
adjoining elements. 
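
The sharing of edge-based degrees of freedom, with consistent directions, can be sketched as follows. The convention assumed here (each global edge is directed from its lower to its higher global node number) is one common choice and is not necessarily the scheme of the preceding chapter; the function and variable names are hypothetical.

```python
# Global edge numbering for a tetrahedral mesh, with a consistent edge direction.
# Assumed convention: a global edge points from the lower to the higher node number.
LOCAL_EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]  # local node pairs


def number_edges(tets):
    """Return (edge_ids, signs, n_edges) for a list of 4-node tetrahedra.

    edge_ids[e][k] is the global index of local edge k of element e;
    signs[e][k] is +1 if the local direction matches the global one, else -1.
    """
    edge_index = {}
    edge_ids, signs = [], []
    for tet in tets:
        ids, sgn = [], []
        for a, b in LOCAL_EDGES:
            na, nb = tet[a], tet[b]
            key = (min(na, nb), max(na, nb))
            # An edge seen before reuses its index: the unknown is shared.
            ids.append(edge_index.setdefault(key, len(edge_index)))
            sgn.append(1 if na < nb else -1)
        edge_ids.append(ids)
        signs.append(sgn)
    return edge_ids, signs, len(edge_index)


# Two tetrahedra sharing the face with global nodes {1, 2, 3}.
mesh = [(0, 1, 2, 3), (4, 3, 2, 1)]
edge_ids, signs, n_edges = number_edges(mesh)
print(n_edges, edge_ids, signs)
```

The sign multiplies the elemental basis function during assembly, so that both elements see the same tangential field on the shared edge. Face-based degrees of freedom would need an analogous treatment, for example keyed on the sorted triple of global node numbers.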

A tricky problem which was surprisingly neglected in the literature until re- 
cently was the question of how to match fields when using hierarchal elements 
to an actual specified field, as required by a Dirichlet boundary condition, for 
instance. With interpolatory elements, this would be trivial, since each degree 
of freedom would, by design, correspond to a tangential field component at a 
point on each element. With hierarchal elements however, this is only uniquely 
defined for Whitney elements. Webb [9] has very elegantly addressed this issue 
for higher-order hierarchal elements using the elements in [3]. As we have seen, 
starting with the conventional Whitney elements, Webb’s elements enrich alter- 
nately the gradient and curl spaces. Webb exploits this in [9] to match alternately 
the tangential components of the electric field, and then the normal component 
of the electric field (the curl space). Since any such matching using hierarchal ele- 
ments is approximate, he uses a projective approach to improve the accuracy of the 
matching. 


10.2 The FEM from the variational boundary value problem viewpoint 


It is useful at this stage to introduce some further ideas from functional analysis, 
extending the introduction in Section 4.5. This approach is strongly influenced by 
the methods used in applied mechanics, and is based on a development presented 
by Botha [12]. It will be especially useful when error estimation methods are dis- 
cussed later in this chapter. 

Firstly, we define a bilinear form. If $X$ and $Y$ are vector spaces, a bilinear
form $B : X \times Y \to \mathbb{C}$ is an operator with the properties

$$
\begin{aligned}
B(\alpha u + \beta w, v) &= \alpha B(u, v) + \beta B(w, v), & u, w \in X,\ v \in Y \\
B(u, \alpha v + \beta w) &= \alpha B(u, v) + \beta B(u, w), & u \in X,\ v, w \in Y
\end{aligned}
\tag{10.6}
$$


with $\alpha$ and $\beta$ complex numbers. In short, the operator $B$ is linear
in each of its “slots.”

In the context of the high-frequency functional, the boundary value problem to 
be solved on domain Q, in terms of the electric field, is the vector wave equation 
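with appropriate boundary conditions, either Dirichlet on $\Gamma_D$ or Neumann
on $\Gamma_N$, with of course the boundary (also called closure)
$\Gamma = \Gamma_D + \Gamma_N$:

$$
\begin{aligned}
\nabla \times \frac{1}{\mu_r} \nabla \times \vec{E} - k_0^2 \epsilon_r \vec{E} &= -j k_0 Z_0 \vec{J} && \text{in } \Omega \\
\hat{n} \times \vec{E} &= 0 && \text{on } \Gamma_D \\
\hat{n} \times \nabla \times \vec{E} &= \vec{N} && \text{on } \Gamma_N
\end{aligned}
\tag{10.7}
$$
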
This is a “strong” version of the problem; a vector field E which satisfies the vector 
wave equation must be twice differentiable. 

Note that from one of Maxwell’s curl equations, the Neumann boundary condition
can also be written as

$$\hat{n} \times \vec{H} = -\frac{1}{j k_0 Z_0 \mu_r} \vec{N} \quad \text{on } \Gamma_N \tag{10.8}$$


Thus, the Neumann boundary condition can be seen equivalently as a constraint 
on tangential H. 

Using a method of weighted residuals approach, with an arbitrary testing func- 
tion W, and otherwise proceeding in a very similar manner to that of Section 9.2.4, 
it may be shown that the following is the “weak” representation of the boundary 
value problem represented by Eq. (10.7): 


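$$
\begin{aligned}
&\iiint_\Omega \left[ \frac{1}{\mu_r} (\nabla \times \vec{E}) \cdot (\nabla \times \vec{W}) - k_0^2 \epsilon_r \vec{E} \cdot \vec{W} \right] dV
 - \iint_{\Gamma_D} \frac{1}{\mu_r} (\nabla \times \vec{E}) \cdot (\hat{n} \times \vec{W})\, dS \\
&\qquad = -\iint_{\Gamma_N} \frac{1}{\mu_r} \vec{N} \cdot \vec{W}\, dS - j k_0 Z_0 \iiint_\Omega \vec{J} \cdot \vec{W}\, dV
\end{aligned}
\tag{10.9}
$$

with $\hat{n} \times \vec{E} = 0$ on $\Gamma_D$.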


A symmetry argument is used to establish that $\vec{W}$ must also satisfy the
homogeneous boundary condition on $\Gamma_D$, so that the surface integral over
$\Gamma_D$ on the left-hand side falls away. The final form of the variational
boundary value problem is


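$$B(\vec{E}, \vec{W}) = L(\vec{W}) \quad \forall \vec{W} \in \mathcal{W},\ \vec{E} \in \mathcal{W} \tag{10.10}$$


The bilinear and linear forms are defined as

$$B(\vec{E}, \vec{W}) = \iiint_\Omega \left[ \frac{1}{\mu_r} (\nabla \times \vec{E}) \cdot (\nabla \times \vec{W}) - k_0^2 \epsilon_r \vec{E} \cdot \vec{W} \right] dV \tag{10.11}$$

$$L(\vec{W}) = -\iint_{\Gamma_N} \frac{1}{\mu_r} \vec{N} \cdot \vec{W}\, dS - j k_0 Z_0 \iiint_\Omega \vec{J} \cdot \vec{W}\, dV \tag{10.12}$$

The space in which the solution and testing vector functions lie is defined as

$$\mathcal{W} = \{ \vec{a} \in H(\mathrm{curl}, \Omega) \mid \hat{n} \times \vec{a} = 0 \text{ on } \Gamma_D \} \tag{10.13}$$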


This is the space of curl-conforming vector basis functions which we have already 
discussed, with the additional constraint of the homogeneous Dirichlet boundary 
condition. 

In this development, the Neumann boundary condition has been “absorbed” into 
the variational boundary value problem — via the first term on the right-hand side of 
Eq. (10.12). (It will be recalled that in Section 9.2.4, a similar result was obtained 
in the context of a homogeneous Neumann boundary condition.) The Dirichlet 
boundary condition must however be explicitly enforced via a restriction on the 
space W. (This sounds more complex than it is; as was seen in Section 9.7, this is 
implemented in practice by zeroing the prescribed edges.) 

With the variational boundary value problem established, one can then proceed 
to demonstrate that the stationary functional representation of the problem is the 
following: 


$$F(\vec{E}) = \frac{1}{2} B(\vec{E}, \vec{E}) - L(\vec{E}), \quad \vec{E} \in \mathcal{W} \tag{10.14}$$


This is the familiar curl-curl functional, which we used in the preceding chapter 
(although the linear term was zero for the eigenvalue problem). Note that this (and 
indeed the variational boundary value problem from which the stationary func- 
tional form is obtained) is known as a “weak” form; the differentiability require- 
ments on the solution space have been reduced (the function E need only be once 
differentiable now). 
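
The link between the stationary functional and the variational boundary value problem can be sketched in two lines. The sketch assumes that the bilinear form is symmetric, $B(\vec{E}, \vec{W}) = B(\vec{W}, \vec{E})$, as the curl-curl form is by inspection; the scalar $\delta$ and the perturbation direction $\vec{W} \in \mathcal{W}$ are introduced here purely for illustration.

$$
\begin{aligned}
F(\vec{E} + \delta \vec{W}) &= \tfrac{1}{2} B(\vec{E} + \delta \vec{W}, \vec{E} + \delta \vec{W}) - L(\vec{E} + \delta \vec{W}) \\
&= F(\vec{E}) + \delta \left[ B(\vec{E}, \vec{W}) - L(\vec{W}) \right] + \tfrac{1}{2} \delta^2 B(\vec{W}, \vec{W})
\end{aligned}
$$

The term linear in $\delta$ vanishes for every admissible $\vec{W}$ exactly when $B(\vec{E}, \vec{W}) = L(\vec{W})$, so that rendering the functional stationary is equivalent to solving the variational boundary value problem.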


10.3 A deterministic 3D application: waveguide obstacle analysis 
10.3.1 Introduction 


The analysis of waveguide discontinuities has been a canonical problem for an- 
alytical, approximate, and now numerical approaches since the pioneering work 
of Marcuvitz and colleagues during the Second World War, now some sixty years 
back. Using variational formulations, and quasi-static approximations of the fields, 
Marcuvitz et al. were able to analyze an extraordinary variety of problems, docu- 
mented in the classic text originally published in 1951 and now fortunately avail- 
able again [13]. Subsequently, mode-matching methods were introduced for the 
analysis of “stepped” discontinuities, i.e. structures where the waveguide modes
could be computed in a stepwise fashion, and matched at two-dimensional planes. 
However, for general, arbitrary discontinuities, and of course those involving non- 
metallic discontinuities such as dielectrics, differential equation based methods 
such as the finite element method (FEM) and finite difference time domain (FDTD)
method are now the methods of choice. In this section, we will first present the for- 
mulation for this, which also affords the opportunity to deal with the more general 
version of the curl-curl functional as discussed in Section 10.2, and then analyze a 
waveguide device using both a code developed by the present author, as well as a 
commercial FEM package. 


10.3.2 The waveguide formulation 


The formulation to be discussed is a straightforward extension of Jin’s approach 
[14], published by the present author in [15]. His formulation addressed two-port, 
single-mode analysis, with the waveguide oriented in the $z$-direction. Here,
general waveguide orientation(s) will be considered. The formulation assumes
hollow, rectangular guide at the ports (although the extension to homogeneously
filled guide is straightforward). The TE$_{10}$ mode is assumed in the
following. In between the ports, in the region to be discretized using finite
elements, the waveguide
may contain linear, inhomogeneous, lossy, dielectric and/or magnetic material(s); 
and/or conductors (for instance, posts or irises); and may change orientation (e.g. 
E-plane bends) or dimension (e.g. E- and/or H-plane steps). The formulation to 
be used does, however, assume isotropic media. The generalization of the analysis 
to multiple ports, the inclusion of higher-order modes, and the extension to more 
general waveguide, will be outlined subsequently. 


Formulation overview 


The key part of the formulation is to write the electric field at port 1
($S_1$) as the sum of the known incident and unknown reflected fields in terms
of the $(\xi, \eta, \zeta)$ coordinate system local to the port, with $\zeta$
in the local direction of propagation, and set to zero at each port, as
follows:

$$
\begin{aligned}
\vec{E}(\xi, \eta, \zeta) &= \vec{E}^{inc}(\xi, \eta, \zeta) + \vec{E}^{ref}(\xi, \eta, \zeta) \\
&= \left[ E_0 \vec{e}_{10}(\xi, \eta) e^{-j k_{z10} \zeta} + R E_0 \vec{e}_{10}(\xi, \eta) e^{+j k_{z10} \zeta} \right]_{\zeta = 0}
\end{aligned}
\tag{10.15}
$$


$\vec{e}_{10}(\xi, \eta)$ is the relevant waveguide eigenmode (the TE$_{10}$
eigenmode here) and $k_{z10}$ is the modal propagation constant. Note that it
is necessary to retain the $e^{\pm j k_{z10} \zeta}$ terms, even though the
field is evaluated at $\zeta = 0$, since the boundary condition to be discussed
involves the derivative of the field, which must be evaluated before setting
$\zeta = 0$.

The next key element of the formulation is to convert Eq. (10.15) to a boundary 
condition of the third type involving both the field and its normal derivative. Such 
boundary conditions can be incorporated in the bilinear functional, as will be seen 
shortly. The detail is given in [14, Section 8.5]; briefly, the result is:

$$\hat{n} \times (\nabla \times \vec{E}) + \gamma\, \hat{n} \times (\hat{n} \times \vec{E}) = \vec{U}^{inc} \tag{10.16}$$

with

$$\gamma = j k_{z10}, \qquad \vec{U}^{inc} = -2 j k_{z10} \vec{E}^{inc} \tag{10.17}$$


It should be noted that, in obtaining Eq. (10.16), the transverse only nature of 
the TE field is exploited. TM modes contain axial E field components, and the 
boundary condition cannot thus be written for an E field solver. TM mode analysis 
could be undertaken by using an H field solver. 

The same is repeated at port 2, but at that port, there is only an unknown trans- 
mitted field: 


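$$
\begin{aligned}
\vec{E}(\xi, \eta, \zeta) &= \vec{E}^{trans}(\xi, \eta, \zeta) \\
&= \left. T E_0 \vec{e}_{10}(\xi, \eta) e^{-j k_{z10} \zeta} \right|_{\zeta = 0}
\end{aligned}
\tag{10.18}
$$


Similar comments apply as regards the $e^{-j k_{z10} \zeta}$ term. The boundary
condition at port 2 is

$$\hat{n} \times (\nabla \times \vec{E}) + \gamma\, \hat{n} \times (\hat{n} \times \vec{E}) = 0 \tag{10.19}$$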


In Jin’s original formulation, the phase was referenced to each port. In the 
present formulation, the transmission coefficient T incorporates the “insertion” 
phase, i.e. for a section of empty guide length $\ell$, $T$ will have phase angle $-k_{z10} \ell$.
This produces the same phase that would be measured using a vector network an- 
alyzer, with reference planes calibrated at the ports. 
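
As a rough check on computed phase data, the insertion phase of a length of empty guide is easily evaluated by hand. The sketch below uses the standard hollow rectangular guide result $k_{z10} = \sqrt{k_0^2 - (\pi/a)^2}$; the 10 GHz frequency and the 40 mm length are values assumed here for illustration, and the function names are hypothetical.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def te10_propagation_constant(freq_hz, a):
    """k_z10 = sqrt(k0^2 - (pi/a)^2) for hollow rectangular guide of width a (m)."""
    k0 = 2.0 * math.pi * freq_hz / C0
    kc = math.pi / a
    if k0 <= kc:
        raise ValueError("TE10 mode is at or below cut-off at this frequency")
    return math.sqrt(k0 * k0 - kc * kc)


def empty_guide_insertion_phase_deg(freq_hz, a, length):
    """Phase of T in degrees, wrapped to (-180, 180], for an empty guide section."""
    phase = -te10_propagation_constant(freq_hz, a) * length
    wrapped = math.atan2(math.sin(phase), math.cos(phase))
    return math.degrees(wrapped)


A_XBAND = 22.86e-3  # broad-wall dimension of X-band guide, m
kz = te10_propagation_constant(10e9, A_XBAND)
phase_deg = empty_guide_insertion_phase_deg(10e9, A_XBAND, 40e-3)
print(f"k_z10 = {kz:.1f} rad/m, insertion phase of 40 mm = {phase_deg:.1f} deg")
```

An FEM solution for an empty section of guide should reproduce this phase closely, which makes it a convenient first validation case.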

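The equivalent variational functional (assuming isotropic but possibly lossy
materials), subject to these boundary conditions on the ports and
$\vec{E}_{\tan} = 0$ on the perfectly conducting walls, is:

$$
\begin{aligned}
F(\vec{E}) = {} & \frac{1}{2} \iiint_\Omega \left[ \frac{1}{\mu_r} (\nabla \times \vec{E}) \cdot (\nabla \times \vec{E}) - k_0^2 \epsilon_r \vec{E} \cdot \vec{E} \right] dV \\
& + \iint_{S_1} \left[ \frac{\gamma}{2} (\hat{n} \times \vec{E}) \cdot (\hat{n} \times \vec{E}) + \vec{E} \cdot \vec{U}^{inc} \right] dS \\
& + \iint_{S_2} \left[ \frac{\gamma}{2} (\hat{n} \times \vec{E}) \cdot (\hat{n} \times \vec{E}) \right] dS
\end{aligned}
\tag{10.20}
$$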


This can be obtained from the development in Section 10.2. In this case, in the
Neumann boundary condition of Eq. (10.7), repeated here,

$$\hat{n} \times \nabla \times \vec{E} = \vec{N} \quad \text{on } \Gamma_N$$

the vector function $\vec{N}$ is
$\vec{U}^{inc} - \gamma\, \hat{n} \times (\hat{n} \times \vec{E})$; this is
substituted into the linear operator $L$ of Eq. (10.12), and a vector identity
is used to shift one of the $\hat{n} \times$ operators to the weighting
function.

For readers interested in the details of the finite element discretization of this 
functional, [14] and [15] are recommended. 


Computation of the S-parameters 


The above formulation produces $R$ and $T$ for port 1 ($S_{11}$ and $S_{21}$). It must be
repeated with an incident field at port 2 to obtain $S_{12}$ and $S_{22}$. Only the excitation
vector changes, so this is simply a question of repeating the matrix solve. For 
multiple ports, the extension is obvious: T is computed at each port, producing 
one column of the S matrix. The excitation is then repeated at each port to produce 
other columns. 

The S-parameters may be computed directly from the fields on the ports. A 
more accurate approach uses the orthogonality of the modes to integrate the fields 
computed over each port [14, Section 8.5]; as an example, for two ports the trans- 
mission coefficient is given by: 


$$T = \frac{2}{a b E_0} \iint_{S_2} \vec{E}(\xi, \eta, 0) \cdot \vec{e}_{10}(\xi, \eta)\, dS \tag{10.21}$$


As before, $\vec{e}_{10}(\xi, \eta)$ is the relevant waveguide eigenmode; $a$
and $b$ are the waveguide dimensions.
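
The benefit of the modal projection in Eq. (10.21) is that contributions from other modes are filtered out by orthogonality. The following sketch illustrates this with a simple midpoint-rule quadrature. It assumes a unit-amplitude TE$_{10}$ mode shape $\sin(\pi \xi / a)$ directed along $\eta$, consistent with the $2/(ab)$ normalization; the function names and the test field are hypothetical.

```python
import cmath
import math


def e10(xi, a):
    """Assumed TE10 mode shape: eta-directed, sin(pi * xi / a), unit amplitude."""
    return math.sin(math.pi * xi / a)


def transmission_coefficient(e_eta, a, b, e0=1.0, nx=200, ny=20):
    """Eq. (10.21) by the midpoint rule; e_eta(xi, eta) is the eta-component of E."""
    dx, dy = a / nx, b / ny
    total = 0.0 + 0.0j
    for ix in range(nx):
        xi = (ix + 0.5) * dx
        for iy in range(ny):
            eta = (iy + 0.5) * dy
            total += e_eta(xi, eta) * e10(xi, a) * dx * dy
    return 2.0 / (a * b * e0) * total


A, B = 22.86e-3, 10.16e-3            # X-band guide dimensions, m
T_TRUE = 0.8 * cmath.exp(-1.2j)      # arbitrary transmission coefficient


def port_field(xi, eta):
    """TE10 field of amplitude T_TRUE, contaminated by a TE20-like term."""
    return T_TRUE * e10(xi, A) + 0.3 * math.sin(2.0 * math.pi * xi / A)


t_projected = transmission_coefficient(port_field, A, B)
print(f"|T| = {abs(t_projected):.4f}, phase = {math.degrees(cmath.phase(t_projected)):.2f} deg")
```

In an FEM code the integral would of course be evaluated element by element over the port faces, using the basis functions, rather than on a regular grid as here.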


The waveguide formulation: another perspective 


The formulation can be viewed as a finite element/boundary integral (FE/BI) for- 
mulation, using the waveguide Green function for “exact” mesh termination. (For 
radiation or scattering problems, FE/BI formulations use the free-space, or some- 
times the half-space, Green function, e.g. [14, Section 10.4]; this is discussed later 
in this chapter.) The current dominant-mode-only analysis uses only the first in 
the infinite series of modes comprising the waveguide Green function. It is accu- 
rate provided that the ports are sufficiently far removed from the discontinuities 
(assuming, of course, that only the dominant mode is above cut-off). Higher-order 
modes are easily included in the formulation; this does require re-computing both 
the left-hand side matrix and right-hand side vector, since the former has one term 
dependent on the propagation constant, and the latter is obviously dependent on 
the incident mode shape. The formulation presently assumes hollow waveguide at 
the ports, i.e. only TE (and TM modes, if an H field solver is also implemented) 
are included. More exotic modes, or numerically determined ones, could also be 
incorporated into the formulation. 
10.4 Application to two waveguide discontinuity problems


With the formulation in hand, we will now proceed to analyze two waveguide 
discontinuity problems. The first demonstrates multi-port analysis; the second 
demonstrates the use of complete vector basis functions. The latter is based on [4]. 


10.4.1 Application to a Magic-T 


Introduction 


The “Magic-T” (see Fig. 10.1) is a 180° hybrid. Such devices are four-port 
structures, with the following interesting properties. A signal applied to port 1 
is evenly split into two in-phase components at ports 2 and 3, and port 4 is 
isolated. Conversely, a signal applied to port 4 is evenly split, but with 180° phase 
difference, between ports 2 and 3, and port 1 is isolated. It can also be operated 
as a combiner, in which case when input signals are applied to ports 2 and 3, the 
sum appears at port 1 and the difference at port 4. Ideally, the S-parameters of the
device are [16, p. 402]: 


$$
[S] = \frac{-j}{\sqrt{2}}
\begin{bmatrix}
0 & 1 & 1 & 0 \\
1 & 0 & 0 & -1 \\
1 & 0 & 0 & 1 \\
0 & -1 & 1 & 0
\end{bmatrix}
\tag{10.22}
$$


Figure 10.1 The Magic-T hybrid waveguide junction.
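
The properties claimed for the ideal device are easily verified from the scattering matrix. The short sketch below enters the ideal 180° hybrid matrix as we read Eq. (10.22), and checks that it is lossless (unitary), that port 1 drives ports 2 and 3 in phase with port 4 isolated, and that port 4 drives ports 2 and 3 in anti-phase with port 1 isolated. The helper names are hypothetical.

```python
import math

K = -1j / math.sqrt(2.0)
# Ideal 180-degree hybrid, ports numbered 1..4 (list indices 0..3).
S = [[K * v for v in row] for row in (
    (0, 1, 1, 0),
    (1, 0, 0, -1),
    (1, 0, 0, 1),
    (0, -1, 1, 0),
)]


def respond(s, a):
    """Outgoing wave amplitudes b = S a for incident amplitudes a."""
    return [sum(s[i][j] * a[j] for j in range(4)) for i in range(4)]


def is_unitary(s, tol=1e-12):
    """Check that S^H S equals the identity, i.e. that the device is lossless."""
    for i in range(4):
        for j in range(4):
            dot = sum(s[k][i].conjugate() * s[k][j] for k in range(4))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return True


b_from_1 = respond(S, [1, 0, 0, 0])   # excite port 1
b_from_4 = respond(S, [0, 0, 0, 1])   # excite port 4
print("unitary:", is_unitary(S))
print("port 1 excited:", b_from_1)
print("port 4 excited:", b_from_4)
```

A computed S-matrix can be tested in the same way; for a lossless structure, departure from unitarity is a useful indicator of discretization error.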
The 180° hybrids can be made in various ways, e.g. microstrip or stripline, the “rat
race” being a very popular implementation in planar technologies. The Magic-T 
is a waveguide implementation of a 180° hybrid [16, p. 411]. 

This is an example of a structure where the approximate analytical techniques of 
Marcuvitz et al. are unable to provide useful data and a full-wave solution becomes 
imperative. (A complex equivalent circuit is presented for the Magic-T [13, p. 386], 
but only measured data at one frequency point are provided.) The behavior of the 
waveguide Magic-T departs very significantly from the ideal of Eq. (10.22), as will 
be seen. 


Setting up the problem 


The setup procedure for the junction as shown in Fig. 10.1 illustrates a number of 
features one would expect in any RF FEM code. The specific code described in 
detail is FEMFEKO, which is an experimental FEM code using FEKO-like input 
and output files, but not available for general use at the time of writing. However, 
the meshing is done using a commercial FEM mesher, FEMAP. In most pack- 
ages, this type of structure is straightforward to model; in FEMAP, for instance, 
the Solid modelling options are the easiest. First, one 40 mm long section of
X-band (22.86 mm x 10.16 mm) guide is generated (as a solid); then the other
40 mm section (at right angles to the first) is added. The structure is then
meshed, using the meshing commands within the package. Following this, the mesh
is exported as a neutral file, from which it can be used by various analysis
packages. (This last step is of course unnecessary in integrated FEM packages
incorporating mesher and solver.)

Boundary conditions must then be applied to the structure. “Port” boundaries 
are required at the four ports of the device, a port corresponding to the region 
where the modal boundary condition of the preceding theoretical discussion is 
applied. In FEMFEKO, a port requires two vectors to define it. The first defines 
the outward directed normal on each port. The second defines the relative “sense” 
of each port; there is an ambiguity regarding the “sense” of the ports, which this 
helps resolve. The problem is that for a straight section of waveguide, it is obvious 
that the sense of each port should be the same, either up or down, but for a bent 
section of guide, the sense is essentially arbitrary. For instance, the tangent vector 
defining the positive modal sense on port 4 could equally well be chosen as $+\hat{y}$ or
$-\hat{y}$. (For the results presented, the former was chosen. On ports 1, 2 and 3, $+\hat{z}$ was
chosen as the tangent vector.) Various packages deal with this issue in different 
ways. 

This problem was also solved using a commercial package, ANSOFT’s HFSS 
code. Constructing the finite element model is very similar to the procedure 
described above, but since the mesher is integrated within the package, it is 
appreciably more user-friendly. Nonetheless, the requirement of correctly
specifying boundaries in particular rests with the user. HFSS meshes the
structure automatically, and then refines the mesh until a user-specified level
of accuracy is reached (usually, a negligible change in S-parameters from one
iterative pass to the next).

For the results to be presented, the geometrical primitive cubes which defined 
port 1 had a length of 40 mm (i.e. approximately 20 mm of guide from the
junction), for ports 2 and 3 they were 30 mm (also approximately 20 mm of guide)
and for port 4, 30 mm (again, approximately 20 mm of guide). These lengths were 
based on the results for other waveguide structures; the requirement is that there 
be sufficient length to allow evanescent modes to die out before the ports. As in 
our 2D eigenvalue problem, the waveguide was an X-band guide with dimensions 
22.86 mm x 10.16 mm.


Results 


This geometry has no simple analytical solution, as already discussed. To obtain
data to compare with these results, ANSOFT’s HFSS code was used to generate
another FEM solution of the problem. The HFSS model was identical in size to the
FEMAP (and FEMFEKO) models, so that phase results could also be compared.


Figure 10.2 S-parameters of the Magic-T for port 1; magnitude.

Figure 10.3 S-parameters of the Magic-T for port 1; phase.

The S-parameter data for port 1 are presented in Figs. 10.2 and 10.3. Port 4 
is indeed isolated; since $S_{41}$ is very small, some discrepancy between the
FEMFEKO and HFSS results is to be expected. Ports 2 and 3 show equal, in-phase,
power splitting. Note, however, how far $S_{11}$ departs from the theoretical
ideal of no reflection at port 1. A brief consideration of the problem shows
that this is not unexpected, since the waveguide fed by port 1 sees two
identical waveguides in parallel at the junction (those connected to ports 2
and 3). Thus a mismatch of around 1/3 (about −10 dB)⁴ is to be expected. We see
that the actual reflection coefficient (as computed) is worse than this.
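
The figure of around 1/3 follows from elementary transmission line theory, under the simplifying assumption (made in the text) that the two output guides appear as a load of half the characteristic impedance. A minimal sketch of the arithmetic, with hypothetical names, is:

```python
import math


def reflection_coefficient(z_load, z0):
    """Voltage reflection coefficient of a load z_load on a line of impedance z0."""
    return (z_load - z0) / (z_load + z0)


# Load of Z0/2, with impedances normalized so that Z0 = 1.
gamma = reflection_coefficient(0.5, 1.0)
gamma_db = 20.0 * math.log10(abs(gamma))
print(f"Gamma = {gamma:.4f}, |Gamma| = {gamma_db:.2f} dB")
```

This is only a circuit-level estimate; it ignores the reactive fields at the junction, which is one reason the computed reflection coefficient differs from it.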

The phase data computed by HFSS for ports 2, 3 and 4 originally had a 180° 
phase difference compared to the FEMFEKO results, due to the mode sense ambi- 
guity discussed above. (HFSS has an option to define the mode sense, but this was 
not used.) This has been corrected in the results shown. 


4 The reflection coefficient of a system with a load equal to half the characteristic impedance. 
The HFSS data used 1458 tetrahedra; the FEMFEKO result, using LT/QN elements,
used 802 tetrahedra (with an average mesh length of around 6.5 mm). HFSS
refines its solution using adaptive meshing techniques, so one has reasonable
confidence that the results are accurate.


Conclusions 


This discussion has demonstrated the application of two FEM codes to a Magic-T 
hybrid, a device whose complex geometry precludes approximate analytical solu- 
tions. Higher-order elements were used and very good results obtained. It was also 
shown that the device’s performance (certainly in terms of $S_{11}$) departs
significantly from the ideal found in textbooks, highlighting the importance of numerical
simulation as a valuable tool in microwave engineering. 


10.4.2 Application to a capacitive iris 


A rationale for complete basis functions 


In Section 10.1, complete vector basis functions were introduced, although little 
motivation was given for their use. The work of Webb is particularly useful in 
this context; [3] comprehensively discusses the motivation for both mixed-order 
and full-order elements. The main thrust of the argument can be summarized as 
follows: the variational functional which is rendered stationary by the finite ele- 
ment procedure consists (at its simplest) of two terms, one related to the curl of the 
electric field and one related to the electric field itself. (This discussion assumes 
the electric field is the working variable. The magnetic field can of course also be 
used.) The curl of the electric field is the time rate of change of the magnetic field. 
As already discussed in Section 10.1, the rationale behind mixed-order vector ele- 
ments is to remove terms from the polynomial approximation of the electric field 
which do not contribute to the magnetic field. In problems where the electric and 
magnetic fields are of more or less equal importance, it makes sense only to use 
the polynomial terms which contribute to both fields, to obtain maximum accuracy 
for a given number of degrees of freedom. 

However, there are a number of problems of interest in RF engineering where 
the fields are dominated by either electric or magnetic fields. In general, a sharp 
edge will result in a singularity in both the electric and the magnetic fields, but 
for certain field and discontinuity orientations, such as the capacitive iris problem 
to be discussed, the singularity is in the electric field alone, and hence the field is 
dominated by the quasi-static electric field behavior. Hence it can be expected that 
full-order elements should be useful for such problems. 

Figure 10.5 Results for a capacitive iris, compared with Marcuvitz’s result, as a function 
of (inverse) mesh size. After [4], ©2003 IEEE, reprinted with permission. 


Results 


Here, a capacitive iris is considered.⁵ The metallic iris, shown in Fig. 10.4, is half 
the height of the waveguide, and again, the analysis is performed at X-band. The 
results shown in Figs. 10.5 and 10.6 were computed at 8.25 GHz, towards the 
bottom end of the X-band frequency range. A number of different meshes were 
generated for the problem; the average edge length in the mesh varied from around 
$h \approx \lambda/6$ for the coarsest mesh to $h \approx \lambda/25$ for the finest. 

5 This example was first published as [4]. 

Figure 10.6 Results for a capacitive iris, compared with Marcuvitz’s result, as a function 
of degrees of freedom. After [4], ©2003 IEEE, reprinted with permission. 

Of interest here are the excellent results for the polynomial complete QT/QN 
elements, which agree very well indeed with Marcuvitz’s (approximate) results 
[13]. (Marcuvitz’s models actually give equivalent circuit parameters. A discussion 
of how to convert these to S-parameters may be found in [15].) In the region 
$4b/\lambda_g < 1$, which is the case at this frequency in X-band waveguide, the 
error bound on Marcuvitz’s results is given as within 1%, a result verified by this 
QT/QN FEM solution. It is clear that LT/QN elements converge very slowly to 
the correct solution for this problem. A commercial FEM code using conventional 
mixed-order elements also produced unconverged results for this problem, despite 
incorporating adaptive mesh refinement techniques. 
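
The validity condition quoted for Marcuvitz’s result can be checked numerically. The sketch below reads the condition as 4b/λg < 1, with λg the TE10 guide wavelength, and assumes standard WR-90 X-band guide: the broad-wall width a = 22.86 mm is an assumption not stated in the text, while b = 10.16 mm is the guide height quoted in this section. The function name `guide_wavelength` is illustrative.

```python
import math

C0 = 299_792_458.0      # speed of light in vacuum, m/s


def guide_wavelength(freq_hz, a_m):
    """Guide wavelength of the TE10 mode in an air-filled rectangular guide of width a_m."""
    lambda0 = C0 / freq_hz
    f_cutoff = C0 / (2.0 * a_m)
    if freq_hz <= f_cutoff:
        raise ValueError("TE10 mode is below cutoff at this frequency")
    return lambda0 / math.sqrt(1.0 - (f_cutoff / freq_hz) ** 2)


a = 22.86e-3            # assumed WR-90 broad-wall width, m
b = 10.16e-3            # guide height, m
f = 8.25e9              # analysis frequency, Hz

lambda_g = guide_wavelength(f, a)
ratio = 4.0 * b / lambda_g

print(f"guide wavelength = {lambda_g * 1e3:.1f} mm, 4b/lambda_g = {ratio:.2f}")
```

With these assumed dimensions the guide wavelength is close to 60 mm and the ratio is about 0.68, so the condition is met. Using the free-space wavelength instead would give a ratio slightly above one, which is why the guide wavelength is taken to be the intended quantity.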

In retrospect, it is clear that this is an especially difficult problem for 
general-purpose finite element solvers. Even with quite a fine mesh overall, it is 
likely that the mesh above the iris may only be two or three elements “thick” (this 
could be improved by manual “seeding”). This of course is precisely the direction in 
which the field is changing most rapidly, and furthermore, the electric field is 
strongly dominated by the quasi-static field with singular behavior. To describe this 
field adequately, one would expect to need full-order elements, thus also 
approximating the gradient space as accurately as possible (for the given maximum 
element order available). 

It is also of interest to note that the relative improvement of the QT/QN elements 
compared to the LT/QN ones appears more marked than the improvement 
of LT/LN over CT/LN. It is quite possible that the mesh in the vicinity of the iris 
(as discussed above) is limiting the performance of the linear elements; indeed, 
only the finest mesh in the above results had three elements “thickness” above the 
iris. 

The above results, using S-parameters, concentrate on what are essentially 
integrated field quantities (also known as observables). It is also of interest to 
examine the actual field behavior in the vicinity of the iris. In Figs. 10.7 and 10.8, 
the vertically directed electric field (the field component aligned with the 
$TE_{10}$ mode electric field) on a vertical line directly above the iris is plotted. 
(The modal excitation $E_0$, see Eq. (10.15), at the port was 1 V/m, in this and 
subsequent plots.) The cut-line is located in the center of the width of the 
waveguide, shown in Fig. 10.4 by the dashed line. The half-height iris runs from 0 to 
5.08 mm; the figures show the field from 5.08 to 10.16 mm, the roof of the guide. 
The superior performance of the QT/QN elements is clear in these figures; even in 
the fine mesh case, Fig. 10.8, the CT/LN results are poor, and evidence a 
considerable (and non-physical) discontinuity at around 7.5 mm. The LT/LN results 
are close to the LT/QN results, and the discontinuity evidenced by the CT/LN results 
has gone. The QT/QN results in both cases give the largest field value at the iris, 
indicating superior modelling of the field in this case. 

Figure 10.7 Vertically directed electric field along a line in the center of the guide, directly 
above the iris. Coarse mesh, $h \approx \lambda/6$. After [4], ©2003 IEEE, reprinted with permission. 

Figure 10.8 Vertically directed electric field along a line in the center of the guide, directly 
above the iris. Fine mesh, $h \approx \lambda/20$. After [4], ©2003 IEEE, reprinted with permission. 

Some further comments here, especially on the CT/LN results, are called for. 
In the coarse mesh case, the mesh generator produced only one row of elements 
above the iris; for the finer mesh, it produced two. In Fig. 10.7, the cut-line ran on 
the boundary of an element, hence the uniform CT/LN result is to be expected. In 
Fig. 10.8, the cut-line went through four elements (the mesh was not symmetrical 
about the center-line), hence the four distinct and different values on the plot. In 
both these figures the CT/LN results are plotted only at the points where the field 
was computed, to avoid an incorrect linear interpolation being imposed by the 
plotting program. 

It might be argued that this comparison is unfair, since obviously the QT/QN 
solution involves many more degrees of freedom than, for example, the CT/LN 
solution on the same mesh. This is not so in this case. The CT/LN results shown 
in Fig. 10.8 used 5523 degrees of freedom; the QT/QN solution in Fig. 10.7 used 
1302, and the solution quality of the latter is clearly far better than that of the 
former, for fewer degrees of freedom. (The issue of the potentially slower con- 
vergence of the higher-order elements will not be considered, since appropriate 
preconditioners can rectify this problem [6].) 


Discussion 


This capacitive iris problem has clearly highlighted the utility of full-order ele- 
ments for problems where quasi-static electric fields dominate the solution. Fur- 
thermore, electric field results for this problem have demonstrated that full-order 
elements can provide enhanced field modelling for a similar (or sometimes even 
smaller) computational effort in situations where the field itself, rather than an in- 
tegrated quantity such as the transmission or reflection coefficient, is of primary 
concern. An interesting idea is to consider how finite element solvers might auto- 
matically identify the appropriate element type in different regions; some prelim- 
inary results show promise [12, Chapter 6; 17]. Work has also recently appeared 
on independently controlling the gradient and rotational polynomial orders [18]. 


10.5 Hybrid finite element/method of moments formulations 
10.5.1 Introduction 


As we have seen, finite element formulations offer powerful methods for the 
numerical solution of electromagnetic fields in inhomogeneous media. The major 
drawback for high-frequency simulation is the requirement for terminating the 
finite element mesh at a finite distance. Various mesh termination schemes have 
been proposed and implemented, including mathematical absorbing boundary 
conditions (requiring special treatment of “boundary” elements) and, more recently, 
perfectly matched layers. In Chapter 3, we studied the application of both these 
methods within the context of the FDTD, and these methods have also been used in 
FEM approaches. In this section, we will instead consider an “exact” termination 
scheme, which effectively uses the method of moments applied on the open 
boundary to terminate the FEM region, producing the FEM/MoM hybrid method. (This 
method is also sometimes called the boundary element/finite element method, or 
boundary integral/finite element method.) 


10.5.2 Theoretical background 


Before addressing the theory of the FEM/MoM hybrid method, a connection 
between the Rao–Wilton–Glisson (RWG) element [19], widely used in MoM 
formulations, and the Whitney (CT/LN) element which we have discussed extensively 
here needs to be highlighted. Much earlier, in Chapter 6, it was commented that 
the RWG element and the Whitney element are intimately connected. The 
relationship is the following: by simply taking the normal crossed with the Whitney 
element, the RWG element is obtained. It will be recalled that the Whitney element 
is also sometimes called “curl conforming”; the RWG element is an example 
of a “divergence conforming” element. (Nedelec’s original work also considered 
such elements, although the RWG element was derived independently.) This close 
relationship is fortunate, but by no means coincidental: the underlying 
requirements of field continuity are the reason for it. This is an 
important practical point, because it implies that edge-element FEM codes, with 
volumetric fields as unknowns, and RWG-based MoM codes, with surface currents 
as unknowns,⁶ can at least potentially conform on a boundary. 
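
The relationship between the two elements can be verified numerically on a single flat triangle. The sketch below is a minimal check, assuming a triangle in the xy-plane with unit normal z, the Whitney function of the edge joining vertices 1 and 2 written in simplex coordinates as λ1∇λ2 − λ2∇λ1, and an RWG-type function of the form (r − p3)/(2A), i.e. without the edge-length scaling. All function names are illustrative, and the overall sign depends on the orientation convention adopted.

```python
def simplex_coords(p, tri):
    """Simplex coordinates l1, l2 of point p in triangle tri, plus twice the signed area."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    two_area = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    l1 = ((x2 - p[0]) * (y3 - p[1]) - (x3 - p[0]) * (y2 - p[1])) / two_area
    l2 = ((x3 - p[0]) * (y1 - p[1]) - (x1 - p[0]) * (y3 - p[1])) / two_area
    return l1, l2, two_area


def whitney_12(p, tri):
    """Whitney (CT/LN) function of the edge joining vertices 1 and 2."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    l1, l2, two_area = simplex_coords(p, tri)
    grad_l1 = ((y2 - y3) / two_area, (x3 - x2) / two_area)
    grad_l2 = ((y3 - y1) / two_area, (x1 - x3) / two_area)
    return (l1 * grad_l2[0] - l2 * grad_l1[0],
            l1 * grad_l2[1] - l2 * grad_l1[1])


def normal_cross(v):
    """Unit normal z crossed with an in-plane vector (vx, vy)."""
    return (-v[1], v[0])


def rwg_shape(p, tri):
    """RWG-type function (r - p3) / (2A), directed away from the vertex opposite edge 1-2."""
    _, _, two_area = simplex_coords(p, tri)
    x3, y3 = tri[2]
    return ((p[0] - x3) / two_area, (p[1] - y3) / two_area)


triangle = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))   # counter-clockwise vertices
point = (0.2, 0.3)

rotated = normal_cross(whitney_12(point, triangle))
reference = rwg_shape(point, triangle)
print(rotated, reference)   # equal and opposite at every point of the triangle
```

With this orientation the rotated Whitney function equals minus the RWG-type function everywhere in the triangle, so the two differ only by a sign and the edge-length factor.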

With this background, we can now consider the FEM/MoM formulation. The 
following is based on the presentation in [1, Chapter 9]. Within a region $\Omega$, 
with closure (bounding surface) $S$, and free space in the exterior region, a finite 
element discretization of Maxwell’s equations, via the stationary functional as in 
Section 10.2, results in the following matrix equation: 

$$[A]^E \{e\} + [B]^E \{h\}_S = \{c\}^E \tag{10.23}$$

In this equation, the superscript $E$ indicates that the E field has been chosen as 
the main working variable. Matrix $[A]$ is the usual FEM matrix obtained from 
the bilinear functional applied throughout the volume; vector $\{e\}$ is the vector of 
unknown coefficients of the electric field in the volume; matrix $[B]$ represents the 
Neumann boundary condition applied on the surface;⁷ and vector $\{h\}$ is the vector 
of unknown coefficients of the magnetic field on the closure. Finally, vector $\{c\}$ 
accounts for current sources internal to the volume. Specifically, the elements of 
each are given by 

$$A^E_{ij} = \int_\Omega \left\{ \frac{1}{\mu_r} (\nabla \times \vec N_i) \cdot (\nabla \times \vec N_j) - k^2 \epsilon_r \vec N_i \cdot \vec N_j \right\} d\Omega, \quad \forall\, i \text{ and } j = 1, \ldots, N \tag{10.24}$$

$$B^E_{ij} = jk\eta \oint_S \vec N_i \cdot (\vec N_j \times \hat n)\, dS, \quad \forall\, i = 1, \ldots, N,\; j = 1, \ldots, N_S \tag{10.25}$$

$$c^E_i = -\int_\Omega \vec N_i \cdot \left\{ jk\eta \vec J^{int} + \nabla \times \left( \frac{1}{\mu_r} \vec K^{int} \right) \right\} d\Omega, \quad \forall\, i = 1, \ldots, N \tag{10.26}$$

In the above expressions, $\vec N_i$ and $\vec N_j$ are the element shape functions. The 
elements of $[A]$ are immediately recognized as the $[S]$ and $[T]$ matrix elements 
discussed in Section 9.7 (albeit now three- rather than two-dimensional), for which 
closed form expressions are available. $\vec J^{int}$ and $\vec K^{int}$ represent sources 
internal to $\Omega$. 

6 Recall that an equivalent surface current is obtained from the normal crossed with the appropriate tangential 
field component. 
7 Recall Eq. (10.8) and the connection with the tangential magnetic fields. 

The problem is clear: there are $N + N_S$ degrees of freedom ($N$ unknowns in 
$\{e\}$ and a further $N_S$ unknowns in $\{h\}_S$, the latter being the H field on the 
boundary surface $S$). An additional constraint is required to connect the surface 
magnetic fields with the volumetric electric fields (which also of course exist on the 
closing surface). In the waveguide formulation, knowledge of the modal structure of 
the field was sufficient, but now a further matrix equation must be derived in terms 
of the surface fields. 

Deriving essentially the EFIE and MFIE, one can obtain the following, suitable 
for an MoM representation on the boundary $S$:⁸ 

$$\vec E(\vec r) = \vec E^{inc}(\vec r) + \oint_S \left( \nabla \times \bar{\bar G}(\vec r, \vec r') \cdot \{\hat n' \times \vec E_S(\vec r')\} - jk\eta\, \bar{\bar G}(\vec r, \vec r') \cdot \{\hat n' \times \vec H_S(\vec r')\} \right) dS' \tag{10.27}$$

and 

$$\vec H(\vec r) = \vec H^{inc}(\vec r) + \oint_S \left( \nabla \times \bar{\bar G}(\vec r, \vec r') \cdot \{\hat n' \times \vec H_S(\vec r')\} + \frac{jk}{\eta}\, \bar{\bar G}(\vec r, \vec r') \cdot \{\hat n' \times \vec E_S(\vec r')\} \right) dS' \tag{10.28}$$

Note that the $S$ subscript refers to quantities on surface $S$, not the scattered field. 
$\bar{\bar G}$ is the dyadic free-space Green function, and $\hat n'$ is the outward directed normal. 

8 Also known as a Huygens integral representation. 

Writing these in a more compact notation, one obtains 

$$-\vec E + L^E_1(\vec E_S \times \hat n') + L^E_2(\vec H_S \times \hat n') + \vec E^{inc}(\vec r) = 0 \tag{10.29}$$

Using a Galerkin procedure, this may be discretized as 

$$[B]^M \{e\}_S + [P]^E \{e\}_S + [Q]^E \{h\}_S + \{y\}^E = 0 \tag{10.30}$$

The matrix $[B]^M$ in the above is of the same form as $[B]^E$ in Eq. (10.25): the 
only difference is that the constant term is $-jk/\eta$ instead of $jk\eta$ [1, pp. 408-409]. 
The other matrices are given by 

$$P^E_{ij} = \frac{jk}{\eta} \oint_S \vec N_i \cdot \{ L^E_1(\vec N_j \times \hat n) \times \hat n \}\, dS \tag{10.31}$$

$$Q^E_{ij} = \frac{jk}{\eta} \oint_S \vec N_i \cdot \{ L^E_2(\vec N_j \times \hat n) \times \hat n \}\, dS \tag{10.32}$$

$$y^E_i = \frac{jk}{\eta} \oint_S \vec N_i \cdot (\vec E^{inc} \times \hat n)\, dS \tag{10.33}$$

The matrix size of $[B]^M$ is $N \times N_S$, but for the boundary element terms in 
Eq. (10.30), only the relevant $N_S \times N_S$ submatrix is retained, so that the above 
matrix equation (10.30) is of dimension $N_S$. Similarly, Eq. (10.28) can be 
discretized to yield 

$$[B]^E \{h\}_S + [P]^M \{h\}_S + [Q]^M \{e\}_S + \{y\}^M = 0 \tag{10.34}$$

Either Eq. (10.30) or (10.34) is sufficient to eliminate $\{h\}_S$ in terms of $\{e\}_S$, 
which is then substituted into Eq. (10.23). (Note that $\{e\}_S \subset \{e\}$, since these are 
just the components of electric field on the surface.) 
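
As a sketch of the elimination step, take Eq. (10.30) as the closing relation and assume that the $N_S \times N_S$ matrix multiplying $\{h\}_S$ (written $[Q]^E$ here) is invertible. The superscript labels follow the reading of Eqs. (10.23) and (10.30) adopted in this section and should be treated as notation assumed for illustration.

```latex
\begin{align*}
\{h\}_S &= -\left([Q]^E\right)^{-1}
           \left\{ \left([B]^M + [P]^E\right)\{e\}_S + \{y\}^E \right\} \\[4pt]
[A]^E\{e\} - [B]^E\left([Q]^E\right)^{-1}\left([B]^M + [P]^E\right)\{e\}_S
        &= \{c\}^E + [B]^E\left([Q]^E\right)^{-1}\{y\}^E
\end{align*}
```

The first line solves Eq. (10.30) for the surface magnetic field coefficients; the second follows by substituting this into Eq. (10.23). The product involving the inverse of the boundary matrix is fully populated over the surface unknowns, which is the origin of the dense block in the otherwise sparse system.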

The [P] and [Q] matrices are not straightforward to compute, since they involve 
integrals of Green’s functions, containing integrable singularities, acting on the 
basis functions; see [1, p.413]. (As we saw in Chapter 6, this is standard in MoM 
formulations involving a rigorous surface current treatment.) The case of a cavity 
in a conducting half-space has been worked further by Jin [14, Chapter 10]; his 
results are also summarized in [1, Chapter 9]. For more general problems, see [14, 
Chapter 10; 20, Chapter 11; 21, Chapter 7]. 

A computational problem which emerges is that the resulting system of linear 
equations is overwhelmingly sparse, but contains a dense submatrix representing 
the MoM (BEM) interactions. The overall matrix is also not, in general, 
symmetric. 


10.6 An application of the FEM/MoM hybrid — GSM base stations 
10.6.1 Applications of FEM/MoM hybrid formulations 


The hybrid FEM/MoM formulation outlined above is applicable to many prob- 
lems. In general antenna analysis, the FEM is not the method of choice for wire 
antennas, where the standard MoM formulation provides a straightforward and 
robust solution. However, when such antennas are radiating in the presence of 
electromagnetically penetrable bodies, the FEM/MoM hybrid comes into its own. 
Modelling the interaction of operators and personal communications systems, in 
particular cellular phones, has emerged as an important field of application of this 
formulation, and the example presented here is a variant on this theme. However, 
there are a number of other applications, which will now be outlined. 

Cavity-backed antennas were one of the first applications of the FEM/MoM 
(BEM) hybrid formulations, see [14], and they continue to attract interest [21]. 
Although the original formulation assumed that the cavity was recessed into an 
infinite ground plane, recent work has extended this to cavities on elliptical shapes, 
permitting analysis of conformal airborne antennas. Microstrip antennas have also 
been efficiently analyzed using this approach; since the substrate, which is 
discretized with the FEM, need not be uniform in this approach, some interesting 
work has been done on the use of perforated substrates (a type of electromagnetic 
band-gap material) to reduce mutual coupling [22]. An important class of 
cavity-backed antenna is the spiral, both Archimedean and logarithmic. Again, 
stratified media MoM codes assume infinite planar media, whereas an FEM/BEM 
formulation need not. 

General FEM/MoM hybrids also permit the analysis of microstrip antennas, 
removing the assumptions of infinite substrate and permitting the effect of edge 
diffraction to be studied. However, this is computationally quite expensive. 

The use of CEM tools in what are often EMC problems can be problematic, 
due to the great complexity of the systems. Work by Hubing’s group has proposed 
the use of the FEM for regions of geometric and material complexity, combined 
with a MoM treatment of the interconnects [23]. Work has also been done on the 
coupling of energy through deep slots using FEM/MoM hybrids. 

Inhomogeneous objects buried in stratified media are another interesting appli- 
cation; perhaps the most obvious candidates here are landmines and other unex- 
ploded ordnance. The formulation required becomes extremely complex, since the 
“exterior” Green functions involve the Sommerfeld potentials. Eibert and Hansen 
present the necessary formulation in [24]. 


10.6.2 Human exposure assessment near GSM base stations 


The widespread adoption of personal communication devices, in particular mobile 
(cellular) telephones, during the 1990s and the continuing growth in the present 
decade presented significant challenges for CEM analysis. When handsets were 
first introduced, there were widespread concerns over safety issues associated with 
their widespread and prolonged use, perhaps triggered by the term 
“radiation.”⁹ After much research, it would appear that these concerns were 
fortunately unfounded, due primarily to the low power levels of the handsets. 
However, a case where there are indeed valid concerns for health issues is that of 
base stations, due to the much higher power levels encountered there (60 W is 
typical) and the requirement for maintenance workers to operate close to the 
antennas. 


An aside — mobile telephony 


Mobile telephony has been one of the most extraordinary technical success sto- 
ries over the last decade. In many countries, in particular outside the First World, 
the number of mobile telephones now exceeds the number of fixed lines, and 
the Group Special Mobile (GSM) standard, originally operating at 900 MHz 
and now also 1800 MHz, has proven wildly popular everywhere apart from the 
USA. Indeed, at the start of 2004, figures from the International Telecommuni- 
cation Union indicate that the number of mobile subscribers worldwide — 1.14 
billion — has just overtaken the number of fixed-line subscribers, at 1.1 billion. 
When one considers that the current fixed-line infrastructure has been under de- 
velopment for the better part of a century, compared to that of less than one 
decade for cellular phones, this is an extraordinary and largely unheralded tech- 
nical achievement, compared to the Internet, for instance. It has had a major 
impact on the lives of many people in less wealthy countries, who would other- 
wise no doubt still be waiting for a fixed line, frequently provided by parastatals 
with very limited capabilities. 


The FDTD, FEM with ABC, FEM/MoM and also volume equivalence principle 
MoM formulations have all been used successfully to assess human exposure to 
radiation from handsets. However, for base stations, one has the problem both of 
complex wire antennas, typically mounted on a mast, and considerable distance 
between the human phantom and the antenna. Figure 10.9 shows an example of a 
typical setup. (Handsets are usually analyzed in very close proximity to the head, 
which has been the major health concern.) Although this was not discussed in our 
theoretical development, a very powerful feature of the FEM/MoM formulation is 
that the exterior region may also contain scatterers/radiators; it need not be purely 
free space, as shown in Fig. 10.10. These scatterers and radiators are treated with 
the MoM in a self-consistent manner. 

9 It must say something of human nature that a number of users who express such concerns are prepared to 
operate their mobile phones while driving, a well-known and much documented hazard, and illegal in many 
countries for precisely this reason! 

Figure 10.9 Near-field base-station exposure setup. After [25], ©2003 IEEE, reprinted 
with permission. 

Figure 10.10 Hybrid FEM/MoM problem setup. After [25], ©2003 IEEE, reprinted with 
permission. 


Meyer has implemented an outward-looking FEM/MoM hybrid (using a number 
of FEKO routines, as well as elements of the FEM code discussed in Section 10.4), 
and in [25], results are shown for base-station exposure assessment in terms of 
ICNIRP¹⁰ guidelines. In that paper, results are also shown for careful validation of 
some smaller problems, using both an FDTD code and FEKO; readers are referred 
to the paper as a good example of this process for complex problems. Here, results 
for only the FEM/MoM hybrid will be shown. Of particular interest are exposure 
results for particular organs, shown in Figs. 10.11 and 10.12. As discussed 
in [25], this particular problem could not be analyzed in any way other than the 
FEM/MoM, since it was electrically too large for both the FDTD and the MoM 
volume equivalence principle. 


10.7 The time domain FEM 


Time domain finite elements are widely used in other fields of engineering, but 
have not seen especially widespread use in CEM. This probably reflects both 
the technological driving forces behind the development of CEM, which until 
the 1980s emphasized the development of frequency domain formulations (since 
most RF communication and radar systems were inherently narrowband), and 
the competing algorithm in the time domain, the FDTD, which is so firmly 
established in CEM and has produced so many excellent results that it is difficult 
to “sell” another time domain formulation. Nonetheless, the finite element time 
domain (FETD) method has seen a considerable amount of work and development 
in CEM over the last decade. In particular for devices with fine geometrical detail, 
it can be expected to emerge as a competitor for specialized applications. Perhaps 
the most interesting use is as a hybrid form with the FDTD, which combines the 
superior geometrical modelling ability of finite elements with the robustness and 
speed of the FDTD method; no commercial implementation is presently available, 
nor is likely to be for some time, but recent research has produced good results 
[26, 27]. In a book which is otherwise devoted to methods already implemented 
in widespread public domain and commercial codes, coverage of this method may 
seem slightly anomalous. However, at least one interesting point which emerges is 
a more general view of the FDTD method, which is actually a special case of the 
general FETD formulation; furthermore, this is a method which we can expect to 
see more of in the future. 

10 International Commission on Non-Ionizing Radiation Protection. 

Figure 10.11 Average specific absorption rate (SAR) in different body organs compared 
with whole-body (0.08 W/kg) and spatial-peak (2 W/kg) ICNIRP basic restriction, 
x-direction (transverse across antenna), for the base-station/half-body problem shown in 
Fig. 10.9. $P_{rad}$ = 60 W. Front of base-station antenna at y = −0.428 m and top-center of 
phantom head at y = 0 m. Adapted from [25], ©2003 IEEE, reprinted with permission. 

Figure 10.12 Average specific absorption rate (SAR) in different body organs for the 
z-direction (along antenna length), for the base-station/half-body problem. Details as in 
Fig. 10.11. Adapted from [25], ©2003 IEEE, reprinted with permission. 


10.7.1 Basic formulation and implementation 


Basic finite element formulation 


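The following formulation is based on the second-order (curl-curl) wave equation 
approach, presented in [14, Section 12.1]: 

$$\nabla \times \left[ \frac{1}{\mu} \nabla \times \vec E(\vec r, t) \right] + \epsilon \frac{\partial^2}{\partial t^2} \vec E(\vec r, t) + \sigma \frac{\partial}{\partial t} \vec E(\vec r, t) = -\frac{\partial}{\partial t} \vec J(\vec r, t), \quad \vec r \in V \tag{10.35}$$

The boundary condition is 

$$\hat n \times \left[ \frac{1}{\mu} \nabla \times \vec E(\vec r, t) \right] + Y \frac{\partial}{\partial t} \left[ \hat n \times \hat n \times \vec E(\vec r, t) \right] = \vec U(\vec r, t), \quad \vec r \in S \tag{10.36}$$

$Y$ is the surface admittance of the boundary, $\hat n$ is the outward unit normal to $S$, and 
$\vec U$ is a known quantity representing the boundary source (if present). 
The corresponding weak-form solution of the boundary value problem is given by 

$$\begin{aligned}
\iiint_V \Big\{ & \frac{1}{\mu} [\nabla \times \vec N_i(\vec r)] \cdot [\nabla \times \vec E(\vec r, t)] + \epsilon \vec N_i(\vec r) \cdot \frac{\partial^2}{\partial t^2} \vec E(\vec r, t) \\
& + \sigma \vec N_i(\vec r) \cdot \frac{\partial}{\partial t} \vec E(\vec r, t) + \vec N_i(\vec r) \cdot \frac{\partial}{\partial t} \vec J(\vec r, t) \Big\}\, dV \\
& + \iint_S \Big\{ Y [\hat n \times \vec N_i(\vec r)] \cdot \frac{\partial}{\partial t} [\hat n \times \vec E(\vec r, t)] + \vec N_i(\vec r) \cdot \vec U(\vec r, t) \Big\}\, dS = 0
\end{aligned} \tag{10.37}$$

The electric field is expanded as 

$$\vec E(\vec r, t) = \sum_{j=1}^{N} u_j(t) \vec N_j(\vec r) \tag{10.38}$$

with $N$ the total number of unknowns, and $\vec N_j(\vec r)$ the usual vector basis functions. 

Substituting this into Eq. (10.37), the following matrix differential equation is 
obtained: 

$$[T] \frac{d^2 \{u\}}{dt^2} + ([R] + [Q]) \frac{d\{u\}}{dt} + [S]\{u\} + \{f\} = \{0\} \tag{10.39}$$

In the above, $\{u\} = [u_1, u_2, \ldots, u_N]^T$ and the matrices are given by: 

$$T_{ij} = \iiint_V \epsilon \vec N_i(\vec r) \cdot \vec N_j(\vec r)\, dV \tag{10.40}$$

$$R_{ij} = \iiint_V \sigma \vec N_i(\vec r) \cdot \vec N_j(\vec r)\, dV \tag{10.41}$$

$$Q_{ij} = \iint_S Y [\hat n \times \vec N_i(\vec r)] \cdot [\hat n \times \vec N_j(\vec r)]\, dS \tag{10.42}$$

$$S_{ij} = \iiint_V \frac{1}{\mu} [\nabla \times \vec N_i(\vec r)] \cdot [\nabla \times \vec N_j(\vec r)]\, dV \tag{10.43}$$

and $\{f\}$ is a column vector given by 

$$f_i = \iiint_V \vec N_i(\vec r) \cdot \frac{\partial}{\partial t} \vec J(\vec r, t)\, dV + \iint_S \vec N_i(\vec r) \cdot \vec U(\vec r, t)\, dS \tag{10.44}$$

Equation (10.39) is an ordinary differential equation in the time domain and can 
be solved using a direct integration or finite difference method. 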

Before departing from this, it should be commented that these equations 
are essentially identical in form to those arising in standard frequency domain 
formulations, and the matrices already computed within a typical finite element 
frequency domain code can be largely re-used. 


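Time integration 

For the time domain discretization, the Newmark-β method is used. (An outline of 
the derivation of the method is given in Appendix B.) The equation to be solved at 
each time-step is the following: 

$$\begin{aligned}
\left\{ \frac{1}{\Delta t^2}[T] + \frac{1}{2\Delta t}[T_\sigma] + \beta[S] \right\} \{u\}^{n+1}
= {} & \left\{ \frac{2}{\Delta t^2}[T] - (1 - 2\beta)[S] \right\} \{u\}^{n} \\
& - \left\{ \frac{1}{\Delta t^2}[T] - \frac{1}{2\Delta t}[T_\sigma] + \beta[S] \right\} \{u\}^{n-1} \\
& - \left[ \beta\{f\}^{n+1} + (1 - 2\beta)\{f\}^{n} + \beta\{f\}^{n-1} \right]
\end{aligned} \tag{10.45}$$

with 

$$[T_\sigma] = [R] + [Q] \tag{10.46}$$

This can be more conveniently written as 

$$[A]\{u\}^{n+1} = [B]\{u\}^{n} + [C]\{u\}^{n-1} - \left[ \beta\{f\}^{n+1} + (1 - 2\beta)\{f\}^{n} + \beta\{f\}^{n-1} \right] \tag{10.47}$$

Clearly, the solution of this is: 

$$\{u\}^{n+1} = [A]^{-1} \left( [B]\{u\}^{n} + [C]\{u\}^{n-1} - \left[ \beta\{f\}^{n+1} + (1 - 2\beta)\{f\}^{n} + \beta\{f\}^{n-1} \right] \right) \tag{10.48}$$

The matrix $[A]$ is time invariant, and may be factored once, each time-step then 
requiring just a forward and backward substitution to establish the next solution 
vector $\{u\}^{n+1}$. With $\beta \ge 0.25$, the method is unconditionally stable, i.e. the 
Courant limit does not apply. 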
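
The time-stepping scheme can be sketched on a toy problem. The example below is not the FETD code discussed here: it assumes a lossless, source-free two-unknown system (so the loss matrix and the forcing vector are zero), uses a small hand-written LU routine without pivoting, and picks a time-step deliberately larger than the central-difference stability limit. All names (`lu_factor`, `lu_solve`, `newmark_peak`) and matrix values are illustrative.

```python
def lu_factor(a):
    """In-place style Doolittle LU (no pivoting); fine for the small SPD matrix used here."""
    n = len(a)
    lu = [row[:] for row in a]
    for k in range(n):
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu


def lu_solve(lu, b):
    """Forward substitution (unit lower triangle) followed by backward substitution."""
    n = len(b)
    x = b[:]
    for i in range(n):
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def newmark_peak(T, S, u0, u1, dt, beta, steps):
    """March the lossless, source-free recurrence and return the largest |u| seen."""
    n = len(u0)
    A = [[T[i][j] / dt**2 + beta * S[i][j] for j in range(n)] for i in range(n)]
    B = [[2.0 * T[i][j] / dt**2 - (1.0 - 2.0 * beta) * S[i][j] for j in range(n)]
         for i in range(n)]
    C = [[-A[i][j] for j in range(n)] for i in range(n)]   # valid because losses are zero
    lu = lu_factor(A)                                      # factored once, reused every step
    prev, cur = u0[:], u1[:]
    peak = max(abs(x) for x in prev + cur)
    for _ in range(steps):
        rhs = [bi + ci for bi, ci in zip(matvec(B, cur), matvec(C, prev))]
        prev, cur = cur, lu_solve(lu, rhs)
        peak = max(peak, max(abs(x) for x in cur))
    return peak


T = [[1.0, 0.0], [0.0, 1.0]]
S = [[2.0, -1.0], [-1.0, 2.0]]      # largest natural frequency is sqrt(3)
dt = 2.0                            # exceeds the central-difference limit 2/sqrt(3)
start = [1.0, 0.0]

peak_newmark = newmark_peak(T, S, start, start, dt, beta=0.25, steps=50)
peak_central = newmark_peak(T, S, start, start, dt, beta=0.0, steps=50)
print(f"beta = 0.25: peak |u| = {peak_newmark:.3f}")
print(f"beta = 0   : peak |u| = {peak_central:.3e}")
```

With β = 0.25 the solution stays bounded although the step is well beyond the explicit limit, whereas with β = 0 it grows without bound. The accuracy at such a large step is of course poor; only stability is being illustrated.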


10.7.2 Preliminary results 


To test the time domain formulation, propagating a plane wave through the mesh 
is usually a good initial test, since one has a simple analytical solution to 
compare with the results. A differentiated Gaussian pulse with $\sigma = 1 \times 10^{-10}$ was 
used; as in Chapter 2, m = 40 was used. This produces a wideband pulse with 
significant spectral content to around 3 GHz. A cuboidal free-space volume 
0.1 × 0.1 × 0.2 m³ was meshed using tetrahedral elements; some 804 elements 
produced a mesh with an average edge length of about 0.0285 m. The plane wave 
was injected traveling in the −z-direction. The result in Fig. 10.13 shows the plane 
wave at three points in the mesh. First z = 0.19 m is illuminated, then z = 0.1 m 
and finally z = 0.01 m. Measuring the distance between the first and last peaks 
shows a delay of 0.60 ns (within the accuracy with which the graph can be read); to 
cover a distance of 0.18 m at the speed of light takes 0.6 ns, so this is very accurate. 
In particular, considering the coarse mesh, the result is really surprisingly good; at 
3 GHz the mesh density is less than four unknowns per wavelength. (At the center 
frequency of the signal, around 1.5 GHz, there are around seven, somewhat better 
but still a very coarse discretization using Whitney (CT/LN) elements.) 
Incidentally, the pulse may appear to have undergone a 180° phase reversal, but this 
was simply due to a coding convention. The late time signal is very likely a 
reflection from the absorbing boundary condition; the value is around 1/20 of the 
incident signal, or −26 dB, not by any means excellent absorber performance, but 
not out of line with what is expected from a first-order ABC. 

Figure 10.13 The differentiated Gaussian having propagated through a free-space “box” 
meshed using tetrahedral Whitney finite elements. Δt = 20 ps. 

Figure 10.14 As for Fig. 10.13, but with Δt = 50 ps. 
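
The arithmetic quoted for this test can be reproduced in a few lines. The sketch below takes the speed of light as 3 × 10⁸ m/s and uses the ratio of wavelength to average edge length as a proxy for "unknowns per wavelength"; both are simplifying assumptions, and the names are illustrative.

```python
import math

C0 = 3.0e8              # speed of light, m/s (rounded)
EDGE = 0.0285           # average edge length of the mesh, m


def edges_per_wavelength(freq_hz):
    """Free-space wavelength divided by the average mesh edge length."""
    return (C0 / freq_hz) / EDGE


delay_ns = 0.18 / C0 * 1e9                       # travel time from z = 0.19 m to z = 0.01 m
reflection_db = 20.0 * math.log10(1.0 / 20.0)    # late-time signal relative to incident pulse

print(f"delay over 0.18 m      : {delay_ns:.2f} ns")
print(f"edges per wavelength   : {edges_per_wavelength(3.0e9):.1f} at 3 GHz, "
      f"{edges_per_wavelength(1.5e9):.1f} at 1.5 GHz")
print(f"ABC reflection level   : {reflection_db:.1f} dB")
```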

Figures 10.14 and 10.15 show the results for Δt = 50 ps and 100 ps respectively. 
Clearly, the result in Fig. 10.15 is very poor, but it is still stable, and what 
is significant is the size of Δt. For a similar FDTD mesh with spatial step size 
0.0285 m, the Courant limit would require Δt < 54.8 ps. The FETD code has 
remained stable at almost twice this limit. (Theoretically of course there is no limit 
for the Newmark-β scheme, but it is gratifying to have this confirmed by numerical 
experimentation.) 
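
The quoted Courant limit follows from the standard stability condition for the Yee scheme on a uniform cubic grid, Δt ≤ Δx/(c√3). A short check, assuming c = 3 × 10⁸ m/s and treating the average tetrahedral edge length as the equivalent FDTD cell size (the function name is illustrative):

```python
import math


def courant_limit(dx, c=3.0e8, dims=3):
    """Largest stable time-step of the Yee FDTD scheme on a uniform grid of cell size dx."""
    return dx / (c * math.sqrt(dims))


dt_max = courant_limit(0.0285)
ratio = 100e-12 / dt_max        # largest FETD step used, relative to the FDTD limit

print(f"Courant limit = {dt_max * 1e12:.1f} ps, 100 ps is {ratio:.2f} times this")
```

The limit evaluates to just under 55 ps, so the 100 ps run is a little over 1.8 times the explicit limit.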


10.7.3 The FDTD method as a special case of the FETD 


If the parameter β is set to zero, the Newmark algorithm reduces to the central
difference algorithm. If, furthermore, we use Galerkin’s method applied to edge
elements defined on cubes rather than tetrahedra, and use 3D trapezoidal
integration (i.e. sample the unknown function only at the center of each side
when integrating), the standard Yee FDTD algorithm emerges. This may initially
be a surprising result, since the FDTD appears to start from a different premise
to the FETD, but it has been noted by a number of workers. In the language of
structural mechanics, this is a “lumping” method, where the mass and stiffness
matrices are reduced to only diagonal elements. This of course implies that the
matrix solution is trivial, which is why the FDTD method apparently has no
matrix associated with it, and hence the explicit nature of the method. For more
details, see, for example, [26].

Figure 10.15 As for Fig. 10.13, but with Δt = 100 ps.
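
The effect of lumping can be seen in a much simpler setting than cubic edge elements. The Python sketch below is an illustrative one-dimensional analogue only (nodal linear elements on a line, not the three-dimensional vector case described above): it integrates the element mass matrix once with a rule that is exact for the integrand, and once with the trapezoidal rule. The function and variable names are assumptions made for this example.

```python
def element_mass(h, rule):
    """2x2 mass matrix of a 1D linear element of length h.

    rule is a list of (xi, weight) pairs on the reference interval [0, 1].
    """
    mass = [[0.0, 0.0], [0.0, 0.0]]
    for xi, w in rule:
        shape = (1.0 - xi, xi)  # the two linear shape functions at xi
        for i in range(2):
            for j in range(2):
                mass[i][j] += w * h * shape[i] * shape[j]
    return mass


# Simpson's rule integrates the quadratic products N_i * N_j exactly.
simpson = [(0.0, 1.0 / 6.0), (0.5, 4.0 / 6.0), (1.0, 1.0 / 6.0)]
# The trapezoidal rule samples only at the two nodes.
trapezoid = [(0.0, 0.5), (1.0, 0.5)]

h = 0.1
consistent = element_mass(h, simpson)   # (h/6) * [[2, 1], [1, 2]]
lumped = element_mass(h, trapezoid)     # (h/2) * [[1, 0], [0, 1]]

print("consistent:", consistent)
print("lumped:    ", lumped)
```

Because each shape function vanishes at the other node, the trapezoidal rule never sees the off-diagonal products, and the element matrix becomes diagonal; this is the mechanism by which an explicit scheme emerges.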


10.8 Sparse matrix solvers 


The development of an FEM code often goes through two major stages: the first 
concentrates on getting the code to work; the second concentrates on optimizing 
the code with regard to both memory usage and run-time. In Chapter 9, for in- 
stance, we focussed exclusively on the former. This process is frequently iterative, 
since new theoretical extensions must again be validated first, and then optimized. 
Furthermore, certain validation can only be undertaken once some optimization is 
already in place. Since the finite element system matrices are usually highly
sparse (i.e. have a very large number of zero entries), the efficiency of the
sparse solver(s) is probably the single most important factor in determining the
overall efficiency of an FEM code: the matrix solution time usually dominates
all other contributors to the total run-time, and FEM codes cannot work
efficiently unless the sparsity of the finite element system matrix is properly
exploited. There are two choices to make when exploiting sparsity.


Iterative solvers Iterative matrix solvers have the major advantage of requiring 
no additional memory beyond that required to store the coefficient matrices. They 
have the major disadvantage that each new solution of the system requires the 
iterative process to be repeated from scratch. 


Direct solvers These are usually variations on the LU decomposition theme, and
factor the matrix into a lower (and an upper, if the matrix is not symmetric)
triangular matrix which permits very rapid subsequent solution of the system.
However, they have the major disadvantage that the factorization process
generates a number of non-zero entries in the matrix; this is known as
“fill-in.” Various methods are used to handle this; here, a method called
“skyline storage” will be used.


Which choice is best is in general problem dependent; surprisingly, even in the 
case of a finite element time domain solver, where the same matrix is involved at 
each time-step, a direct solver is not necessarily the best solution. (In that specific 
case, the real valued system generated appears to be well conditioned, resulting 
in very rapid convergence of the iterative process. Dibben and Metaxas reported 
this in some of the earlier work in the field [28].) The memory overhead of the 
profiled storage scheme can also be prohibitive. For frequency domain solvers, the 
complex valued matrix can become very ill conditioned, and generally some form 
of preconditioning is required if an iterative solver is used. 
First, two methods for storing a sparse matrix will be discussed. 


10.8.1 Profile-in skyline storage 


Consider the symmetrical matrix [A]:

\[
[A] = \begin{bmatrix}
a_{11} & a_{12} &        &        &        \\
a_{21} & a_{22} &        & a_{24} & a_{25} \\
       &        & a_{33} & 0      & 0      \\
       & a_{42} & 0      & a_{44} & 0      \\
       & a_{52} & 0      & 0      & a_{55}
\end{bmatrix}
\tag{10.49}
\]

with $a_{12} = a_{21}$, etc. Here, an observation will be made. If this matrix
is factored, without pivoting, then possible¹¹ fill-ins will occur in [L] to the
right of the first non-zero entry in a row across to the diagonal (and
similarly, in [U] under the first non-zero entry in a column down to the
diagonal). Hence, if all the zeros indicated above are stored, the factored
matrix is guaranteed to fit into the data structure. This type of data structure
is called a “skyline” matrix. There are several methods for storing the data:
the one adopted here is called “profile-in,” and what is stored is the elements
in each row (column) from the first non-zero element to the diagonal (hence
“in,” since one moves inwards to the diagonal). Additionally, the index of the
diagonal element is stored. For this matrix, the profile-in storage looks as
follows:


\[
\begin{aligned}
\mathrm{AL} &= [a_{11}, a_{21}, a_{22}, a_{33}, a_{42}, 0, a_{44}, a_{52}, 0, 0, a_{55}] \\
\mathrm{IALDIAG} &= [1, 3, 4, 7, 11]
\end{aligned}
\tag{10.50}
\]


Since the matrix is symmetric, these structures could equally have been AU and
IAUDIAG. The dimension of IALDIAG is n. The dimension of AL is at least nzs, the
number of non-zeros in the lower (or upper) triangular half. Unfortunately, it
is frequently many times this number.
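
The profile-in scheme can be sketched in a few lines of Python. This is a minimal illustration, not the storage code used for the results in this chapter: it starts from a full matrix purely for clarity, uses placeholder numerical values (the entry a_ij is represented by the number 10i + j, with symmetry imposed), and keeps the 1-based indices of Eq. (10.50).

```python
def to_profile_in(matrix):
    """Return (AL, IALDIAG) for the lower triangle of a square matrix.

    Each row is stored from its first non-zero entry up to and including
    the diagonal; IALDIAG holds the 1-based position of each diagonal in AL.
    """
    al, ialdiag = [], []
    for i, row in enumerate(matrix):
        # Column of the first non-zero in this row (the diagonal at the latest).
        first = next(j for j in range(i + 1) if row[j] != 0 or j == i)
        al.extend(row[first:i + 1])
        ialdiag.append(len(al))
    return al, ialdiag


# Placeholder values with the sparsity pattern of Eq. (10.49).
A = [
    [11, 21,  0,  0,  0],
    [21, 22,  0, 42, 52],
    [ 0,  0, 33,  0,  0],
    [ 0, 42,  0, 44,  0],
    [ 0, 52,  0,  0, 55],
]

AL, IALDIAG = to_profile_in(A)
print(AL)
print(IALDIAG)
```

Note that three of the eleven stored entries are zeros lying inside the profile; on realistic meshes this overhead is far larger.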


10.8.2 Compressed row storage 


Skyline storage is convenient when factoring a matrix but has a very high over- 
head, which only becomes clear when much larger finite element matrices are 
considered. The percentage of non-zero elements rapidly drops under one percent, 
but the profiled storage results in a very large number of zeros being stored, fre- 
quently an appreciable fraction of the original matrix. For iterative solvers, which 
require only a matrix-vector product, a much more efficient scheme is compressed 
row storage (CRS). Here, absolutely only the non-zero elements are stored. Since 
the storage requirements of a CRS matrix are so small, it is convenient to store 
each row completely, even if the matrix is symmetrical — this makes the sparse 
matrix-vector product far easier to write. In addition to an array storing the non- 
zero matrix elements, two other pointer arrays are needed. One stores the starting 
index of each row, the other stores the column indices. For the above matrix, the 
CRS equivalent is: 


\[
\begin{aligned}
\mathrm{A\_CRS} &= [a_{11}, a_{12}, a_{21}, a_{22}, a_{24}, a_{25}, a_{33}, a_{42}, a_{44}, a_{52}, a_{55}] \\
\mathrm{JA} &= [1, 2, 1, 2, 4, 5, 3, 2, 4, 2, 5]
\end{aligned}
\tag{10.51}
\]
\[
\mathrm{IA} = [1, 3, 7, 8, 10, 12] \tag{10.52}
\]

The (n + 1)th element of IA is nnz + 1, where nnz is the number of non-zeros.
The dimension of IA is n + 1, and the dimensions of both A_CRS and JA are nnz.
These are known a priori, as soon as the matrix entries are known.

¹¹ Not all these positions will indeed be filled. More sophisticated methods do a better job of this process.

This storage scheme is also known as “general storage by rows.” 

A very similar storage scheme (and the one implemented in MATLAB) is com- 
pressed column storage; the procedure simply interchanges the storage direction. 
Since finite element matrices are generally symmetrical (unless one is dealing with 
non-reciprocal materials) the schemes are in practice essentially identical for finite 
element applications. 
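
A minimal Python sketch of compressed row storage and the associated matrix-vector product follows. The array names mirror those used above (with the row pointer array called IA); the numerical values are placeholders chosen so that the entry a_ij appears as 10i + j with symmetry imposed, and the conversion from a full matrix is for illustration only.

```python
def to_crs(matrix):
    """Return (A_CRS, JA, IA) with 1-based column indices and row pointers."""
    a_crs, ja, ia = [], [], [1]
    for row in matrix:
        for j, value in enumerate(row, start=1):
            if value != 0:
                a_crs.append(value)
                ja.append(j)
        ia.append(len(a_crs) + 1)  # start of the next row
    return a_crs, ja, ia


def crs_matvec(a_crs, ja, ia, x):
    """Compute y = A x using only the stored non-zero entries."""
    y = []
    for i in range(len(ia) - 1):
        total = 0.0
        for k in range(ia[i] - 1, ia[i + 1] - 1):  # convert to 0-based
            total += a_crs[k] * x[ja[k] - 1]
        y.append(total)
    return y


A = [
    [11, 21,  0,  0,  0],
    [21, 22,  0, 42, 52],
    [ 0,  0, 33,  0,  0],
    [ 0, 42,  0, 44,  0],
    [ 0, 52,  0,  0, 55],
]

A_CRS, JA, IA = to_crs(A)
y = crs_matvec(A_CRS, JA, IA, [1.0] * 5)
print(JA, IA, y)
```

Storing every row completely, as here, is what keeps the product loop so short: no special handling of the symmetric half is needed.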


10.8.3 Implementation of matrix solution using these storage schemes 


Sparse matrices are important for two reasons: firstly, to save memory, and
secondly, to reduce run-time. Unfortunately, at the time of writing there is no
analogue of the excellent public domain LAPACK routines for sparse matrices. If
working with languages such as FORTRAN 90, sparse libraries may be available,
either bundled with the compiler or for purchase separately.¹² However, actually
storing
the matrix in sparse form is a complex book-keeping task; one has firstly to estab- 
lish the connections between all the degrees of freedom present (and this becomes 
increasingly more complex as higher-order elements are added) to determine the 
number of non-zero entries, following which the compressed matrices may then 
be filled as the matrix is assembled. Alternatively, and rather more easily, a full 
matrix may be generated first, and a sparse matrix then generated from this — the 
MATLAB function sparse does precisely this. However, the requirement to store 
the full matrix first wastes large amounts of memory, and is not practical for FEM 
codes designed for electromagnetically large problems. 
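
To illustrate assembling directly into sparse form without ever forming the full matrix, the following Python sketch accumulates element contributions row by row and then compresses them to CRS. It is a simplified variant of the procedure described above, under stated assumptions: the connectivity is discovered on the fly using one dictionary per row rather than in a separate counting pass, and the mesh is a hypothetical chain of four two-node elements sharing one local matrix.

```python
def assemble_crs(n_dofs, elements, local_matrix):
    """Assemble element matrices straight into CRS (1-based JA and IA).

    elements is a list of tuples of global (0-based) degree-of-freedom
    numbers; local_matrix is the dense element matrix shared by all elements.
    """
    rows = [dict() for _ in range(n_dofs)]  # one {column: value} map per row
    for dofs in elements:
        for a, i in enumerate(dofs):
            for b, j in enumerate(dofs):
                rows[i][j] = rows[i].get(j, 0.0) + local_matrix[a][b]

    a_crs, ja, ia = [], [], [1]
    for row in rows:
        for j in sorted(row):          # keep column indices ordered
            a_crs.append(row[j])
            ja.append(j + 1)
        ia.append(len(a_crs) + 1)
    return a_crs, ja, ia


elements = [(0, 1), (1, 2), (2, 3), (3, 4)]
local = [[1.0, -1.0], [-1.0, 1.0]]
A_CRS, JA, IA = assemble_crs(5, elements, local)
print(len(A_CRS), "non-zeros stored instead of", 5 * 5)
```

The memory used grows with the number of non-zeros only, at the cost of the extra book-keeping in the per-row maps.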

It should be mentioned that especially higher-order elements appear to generate 
ill-conditioned matrices. When using iterative methods, such as conjugate gradient 
schemes (CG, Bi-CG), QMR and GMRES, convergence tends to be erratic. (For a 
description and discussion of these algorithms, see [14].) Some recent approaches 
have focussed on the use of more sophisticated preconditioners. Incomplete LU 
preconditioning is one possibility; another is the use of a direct solution of the 
CT/LN solution (which can generally be computed quite cheaply) as a precon- 
ditioner for the LT/QN matrix. This has been extended to higher-order schemes 
by [6]. Most of the more sophisticated preconditioner schemes trade off quicker 
convergence for increased matrix storage requirements. 
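
For orientation, a preconditioned conjugate gradient iteration can be sketched as follows in Python. This is a minimal sketch under stated assumptions, not the solver used for the results in this chapter: it applies only to real, symmetric positive definite systems, uses the simplest possible (diagonal, or Jacobi) preconditioner rather than incomplete LU, and takes the matrix as a user-supplied matrix-vector product such as a CRS routine. The test matrix is a hypothetical tridiagonal one.

```python
import math


def pcg(matvec, b, diag, tol=1e-10, max_iter=1000):
    """Jacobi-preconditioned CG; returns (solution, iterations used)."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                                  # residual for x = 0
    z = [r[i] / diag[i] for i in range(n)]       # preconditioned residual
    p = list(z)
    rz = sum(r[i] * z[i] for i in range(n))
    b_norm = math.sqrt(sum(v * v for v in b))
    for it in range(1, max_iter + 1):
        ap = matvec(p)
        alpha = rz / sum(p[i] * ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * ap[i] for i in range(n)]
        # Stop on the normalized residual ||r|| / ||b||.
        if math.sqrt(sum(v * v for v in r)) <= tol * b_norm:
            return x, it
        z = [r[i] / diag[i] for i in range(n)]
        rz_new = sum(r[i] * z[i] for i in range(n))
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x, max_iter


def tridiag_matvec(v):
    """Product with the matrix having 2 on the diagonal and -1 beside it."""
    n = len(v)
    return [2.0 * v[i]
            - (v[i - 1] if i > 0 else 0.0)
            - (v[i + 1] if i < n - 1 else 0.0) for i in range(n)]


b = tridiag_matvec([1.0] * 5)        # right-hand side whose solution is all ones
x, iters = pcg(tridiag_matvec, b, diag=[2.0] * 5)
print(x, iters)
```

Only the matrix-vector product touches the matrix, which is why CRS storage is sufficient for this class of solver.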

Direct solvers have a place; generally, ill-conditioning is far less
problematic, but the fill-ins can result in very large matrices indeed.
Renumbering schemes can ameliorate this, but unfortunately 3D finite element
meshes tend to generate matrices with significant “bandwidth.”

¹² As an example, the Compaq Visual Fortran (previously Digital Fortran) Fortran 90/95 compiler includes a
library package called Compaq Extended Math Library (CXML). Included are direct and iterative solvers.

Figure 10.16 Solver times for the cubic example given in Section 10.7.2. Solve times are
for 100 time-steps. The CG solver normalized residual target was 1 x 107°.


10.8.4 Results for sparse storage schemes 


Some results illustrating the impact of sparse matrix solvers on an FEM code — 
in this case the FETD implementation by the author discussed in Section 10.7 — 
are shown in Figs. 10.16 and 10.17. The times shown in Fig. 10.16 compare the 
time using the sparse skyline or iterative CG solver (using CRS) with those of a 
full matrix solver (the latter not exploiting symmetry, i.e. worst case). Similarly, 
the memory percentages shown in Fig. 10.17 compare the relevant storage scheme 
with a full matrix scheme not using symmetry. The skyline storage is actually
considerably less efficient than might be inferred from Fig. 10.17. With either
of the sparse schemes, the [B] and [C] matrices in Eq. (10.47) can be (and are)
stored in the much more efficient CRS form, whereas in a full matrix scheme they
are of course stored as full matrices; there is therefore already a saving by a
factor of very close to three, which is reflected in this figure. (CRS stored
matrices require negligible storage compared to even skyline schemes, hence the
factor of approximately three.)

Figure 10.17 Memory usage for the cubic example given in Section 10.7.2. This is
expressed as a percentage of the memory required by a full matrix scheme making
no use of symmetry.

These results are significant for code developers. Firstly, the timing results in 
Fig. 10.16 indicate that any sparse scheme is significantly better than none, as 
would be expected. Another interesting result is the comparison of the run-time 
of the CG solver with the skyline solver (lower left-hand plot in Fig. 10.16). The 
reason is that, for this problem at least, the number of iterations is almost con- 
stant, irrespective of problem size. This is shown in the lower right-hand graph 
in Fig. 10.16. (Although not shown on the figures, the number of iterations re- 
quired also did not change from time-step to time-step.) One must be cautious 
of extrapolating this result to electromagnetically larger and more complex prob- 
lems. These results were generated by increasingly refining the same problem. It 
is well known that the convergence rate of iterative solvers is a function of the 
ratio of maximum to minimum eigenvalues; furthermore, for any given electro- 
magnetic problem, discretization beyond a certain point does not yield more sig- 
nificant eigenvalues. Electromagnetically larger problems may of course contain 
a wider eigenvalue spectrum. This note of caution notwithstanding, the results for 
the iterative solvers are highly encouraging, since no effort was made with these 
results to increase the rate of convergence, and an entire class of methods using 
various preconditioners exists which can still be applied. The memory savings of 
the iterative solver are of course very impressive (right-hand graph in Fig. 10.17) 
and imply that the limit on large problems is more likely to be run-time than 
memory. 
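
The dependence on the eigenvalue ratio can be made concrete with the classical worst-case bound for the conjugate gradient method applied to a symmetric positive definite matrix, in which the error in the energy norm after k iterations is reduced by at most 2((√κ − 1)/(√κ + 1))^k, with κ the ratio of maximum to minimum eigenvalue. The Python sketch below simply inverts this bound; it is a standard textbook estimate, not a result from the experiments reported here, and actual iteration counts are often considerably lower.

```python
import math


def cg_iterations_estimate(kappa, eps):
    """Iterations after which the classical CG bound guarantees a reduction eps.

    kappa is the ratio of maximum to minimum eigenvalue (SPD matrix assumed);
    eps is the required relative reduction of the error in the energy norm.
    """
    if kappa <= 1.0:
        return 1  # all eigenvalues equal: CG converges in a single step
    s = math.sqrt(kappa)
    rho = (s - 1.0) / (s + 1.0)  # per-iteration contraction factor in the bound
    return math.ceil(math.log(2.0 / eps) / -math.log(rho))


for kappa in (10.0, 100.0, 400.0):
    print(kappa, cg_iterations_estimate(kappa, 1e-5))
```

The estimate grows roughly as √κ, which is consistent with an almost constant iteration count when refinement does not widen the significant part of the eigenvalue spectrum.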

A final comment on these results. The graphs comparing memory savings are
actually in terms of memory locations required, rather than actual Mbytes of RAM
used. The CXML libraries use double precision, so in RAM, the percentages are
twice that shown in the graphs. (Double precision was presumably used since the
sparse factorizer does not apply pivoting. All the results were tested against a full 
matrix direct solver, and results were generally identical within working preci- 
sion, around 4—5 significant figures after 100 time-steps. The CG solver normal- 
ized residual was set as 1 x 10~> to ensure that inaccuracies did not accumu- 
late during time-stepping.) This is a peculiarity of the particular implementation 
rather than the method per se. It also means that the computation times using the 
sparse schemes are slightly longer than would be the case if single precision were 
used. 


10.9 A posteriori error estimation and adaptive meshing 


As a final topic, some interesting recent work by Botha on the problem of esti- 
mating errors in the finite element solution will be outlined [12]. One of the main 
advantages of the FEM over the FDTD is that, theoretically at least, it is easy to 
refine a finite element mesh selectively. This can either be done by increasing the 
element order (p adaptation), decreasing the element size (h adaptation), or
doing both (h-p adaptation). In practice of course, mesh refinement does bring
some complexity.

However, before one can undertake any form of mesh refinement, one needs an 
idea of in which part of the mesh the greatest benefits will be obtained. (Simply 
refining the entire mesh is of course a valid process, but computationally expen- 
sive. This is sometimes known as uniform mesh refinement.) It is here that the 
complex topic of error estimation comes to the fore. Here, one needs firstly to dis- 
tinguish between a priori and a posteriori error estimates. The former are derived 
theoretically, and do not use the specific geometrical data represented by the mesh; 
examples are the analysis of dispersion error in a finite element or finite difference 
mesh. The latter are derived from the approximate solution, and it is these that will 
be considered here. 

A posteriori error estimates can themselves be categorized as follows. 


Explicit, residual-based These estimators are usually rigorously derived in the 
sense that the sum of the errors in each element is an upper bound on the error. 
(Here, we assume some suitable norm is available; often, the energy norm, dis- 
cussed subsequently, is used.) Typically, field discontinuities at element edges and 
faces are evaluated. 


Implicit, residual-based These estimators are based on the solution of local
variational boundary value problems, usually on an element-wise basis. Usually,
an estimate of the error is made using additional basis functions of higher
order than the initial solution. Since this is done on an element-by-element
basis, this is not prohibitively expensive computationally, certainly not when
compared to uniform refinement.


Estimation through post-processing These methods estimate the error in a deriva- 
tive of the solution field, by comparing it with an improved version. Although this 
may seem counter-intuitive, some methods are available for computing improved 
versions of the solution field and its derivatives. 


Targeted quantities This is also known as goal-oriented or targeted error
estimation. Such estimators attempt to bound the error of a quantity based on
some functional output of the solution field. An example is the S-parameters
discussed in the context of the waveguide formulation.


Botha’s work focussed on explicit and implicit residual-based methods; the best 
results in general were obtained with the former, and a very brief summary of the 
method will now be presented. 


10.9.1 Explicit, residual-based error estimators 


Firstly, one must define the error in the solution as

\[
\vec{e}_h = \vec{E} - \vec{E}_h \tag{10.53}
\]

where $\vec{E}$ is the (usually unknown) exact solution of the problem, and
$\vec{E}_h$ is the approximate, finite element computed solution. Botha showed
that an estimate of the error in the CT/LN solution may be obtained as


\[
\|\vec{e}_h\|^2_{E(\Omega),\tau} \le C \sum_{i=1}^{N} \left[ h_i^2 \, \|\vec{R}_V\|^2_{L^2(K_i)}
+ \frac{1}{2} \sum_{f \subset K_i} h_f \, \|\vec{R}_f\|^2_{L^2(f)} \right] \tag{10.54}
\]

N is the number of elements in the mesh; τ refers to the current discretization
and solution, which will be used to compute the error indicators. The constant C
is in general unknown, but is independent of solution field and source terms;
error estimates usually contain such constants.

The term $\|\vec{R}_V\|_{L^2(K_i)}$ is the volume residual on element i, with
volume $K_i$, measured in the $L^2$ norm (the norm on the space of square
integrable functions). The volume residual in element i is computed from


\[
\vec{R}_V = -\nabla \times \frac{1}{\mu_r} \nabla \times \vec{E}_h + k_0^2 \epsilon_r \vec{E}_h
- j k_0 Z_0 \vec{J} \quad \text{in } K_i \tag{10.55}
\]

In other words, this is the difference between the finite element computed
solution, and the specified impressed current; in short, the residual of the
vector wave equation. (If the latter is zero, then this term should of course be
zero.) Were the solution exact, then this residual would be zero throughout the
finite element volume.

The face residual on the surfaces of element i is computed from

\[
\vec{R}_f = \hat{n} \times \left( \frac{1}{\mu_r^{(1)}} \nabla \times \vec{E}_h^{(1)}
- \frac{1}{\mu_r^{(2)}} \nabla \times \vec{E}_h^{(2)} \right)
\quad \text{on } f_m;\ m = 1, \ldots, 4 \tag{10.56}
\]

$f_m$ is a specific face of the element, and the superscripts (1) and (2)
indicate the elements shared by a particular face. In other words, this is the
discontinuity in tangential magnetic field on the inter-element boundaries;
again, were the solution exact, then this residual would be zero at all
inter-element boundaries. Note that a special treatment, not shown here, is
required at the Neumann boundary.

Whilst it may seem obvious that such residuals provide an indication of the error 
in the solution, some subtle mathematical arguments are required to show that the 
sum of residuals in Eq. (10.54) does indeed produce a bounded estimate of the 
overall error; the details may be found in [12, Chapter 5]. 

It should also be commented that the “norm” of $\vec{e}_h$ on the left-hand side
of Eq. (10.54) is not a proper norm of the error field, but rather an
approximate energy
norm. (The reason that this does not conform to the usual definition is that this 
energy norm can be zero, without the field being zero. However, the converse is 
indeed true, i.e. the energy norm of a zero-valued field is zero.) The reason that this 
needs to be introduced is rooted in the complex valued nature of the functional. The 
approximate energy norm for space of maximum (but not necessarily complete) 
polynomial order p is defined as 


Sly EV x t-V xt — Keb -ddVv| 
The term in the denominator, $|\cdot|_{(H^p(K_m))^3}$, represents the vector
Sobolev semi-norm of derivative order p on domain $K_m$. Details of its
evaluation may be found in [12].


10.9.2 An example of the application of an error estimator 


An insightful example of the application of an error estimator may be found in
[29]. The problem is an X-band waveguide filter (Fig. 10.18), with three metal
septa along its center, normal to the broad walls of the waveguide (we have
already encountered this problem in Chapter 3). The explicit residual-based
error indicator was applied, and results were obtained indicating where the
computed errors were the highest. These are shown in Figs. 10.19-10.22. As
expected, the errors cluster around the edges of the septa.

Figure 10.18 The waveguide filter geometry. After [29], ©2002 IEEE, reprinted with
permission.

Figure 10.19 The waveguide filter: 2.5% elements with highest indicated error. After [29],
©2002 IEEE, reprinted with permission.

Once one has an indication of where the errors are most serious, one has various
options to improve the solution. In this case, the results were used to drive a
p-adaptive scheme, using the hierarchal elements of both mixed and complete
order discussed earlier in this chapter. This permits a variety of possible
schemes. The original solution was obtained with CT/LN elements; one possibility
is to upgrade all the elements with the highest indicated error to QT/QN (which
was the highest order available within the code); another is to upgrade to LT/QN
elements; and a final possibility is a graded scheme, whereby the third of the
elements with the highest errors are upgraded to QT/QN, then the next third to
LT/QN and the last third to LT/LN. Results are shown in Fig. 10.23. The
percentage error in center frequency is plotted against the number of degrees of
freedom, which obviously grows as more and more elements are refined.
Interestingly, the performance of the QT/QN and LT/QN schemes was similar, but
the graded scheme was not very successful, primarily due to the inclusion of the
LT/LN elements. These elements do not appear to be very beneficial in waveguide
finite element analysis, a phenomenon noted and discussed in [4]. It should be
emphasized that this particular graded scheme is a heuristic one, and others
could of course be proposed.

Figure 10.20 The waveguide filter: 5% elements with highest indicated error. After [29],
©2002 IEEE, reprinted with permission.

Figure 10.21 The waveguide filter: 10% elements with highest indicated error. After [29],
©2002 IEEE, reprinted with permission.

Figure 10.22 The waveguide filter: 20% elements with highest indicated error. After [29],
©2002 IEEE, reprinted with permission.

Figure 10.23 Waveguide filter center frequency versus number of degrees of freedom.
A comparison of three upgrading schemes. After [29], ©2002 IEEE, reprinted with
permission.
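
The selection step of such a p-adaptive scheme is straightforward to express in code. The Python sketch below ranks elements by their error indicator, selects a given fraction, and applies the graded rule described in this section (first third to QT/QN, next third to LT/QN, last third to LT/LN). It is an illustration only: the indicator values are invented, the function name is hypothetical, and how the code in [29] handled ties and rounding is not known here.

```python
def graded_upgrade(indicators, fraction):
    """Return a list of element orders after a graded p-upgrade.

    indicators holds one error indicator per element; fraction is the
    proportion of elements (highest indicators first) to be upgraded.
    """
    n = len(indicators)
    ranked = sorted(range(n), key=lambda e: indicators[e], reverse=True)
    selected = ranked[:round(fraction * n)]
    k = len(selected)

    orders = ["CT/LN"] * n                 # everything starts at lowest order
    for rank, element in enumerate(selected):
        if 3 * rank < k:
            orders[element] = "QT/QN"      # worst third
        elif 3 * rank < 2 * k:
            orders[element] = "LT/QN"      # middle third
        else:
            orders[element] = "LT/LN"      # remaining third
    return orders


indicators = [0.1, 0.9, 0.3, 0.8, 0.05, 0.7, 0.2, 0.6, 0.4, 0.5]
orders = graded_upgrade(indicators, 0.6)
print(orders)
```

Replacing the three-way split by a single target order gives the uniform QT/QN or LT/QN upgrade schemes with which the graded scheme was compared.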
10.10 Further reading and conclusions


Most of the references cited in Section 9.9 are also of course relevant here. Jin’s 
second edition [14] is probably the best single-volume reference in this regard, and 
includes a chapter devoted to time domain FEA and another to matrix solution. 

In the context of higher-order vector elements, it should be noted that there 
is another school of thought regarding the construction of higher-order basis 
functions, which might be described as the degree of freedom centered approach 
(as opposed to that given in this chapter, which could be described as the ba- 
sis function centered approach). Salazar-Palma et al. [30] use elements from the 
Nedelec polynomial space and enforce Lagrangian interpolatory properties on the 
degrees of freedom. This produces interpolatory elements with well-defined de- 
grees of freedom at points, but at the time of writing, no-one had yet succeeded 
in doing this in general with higher-order hierarchal elements. Yioultsis and Tsi- 
boukis take a similar degree of freedom centered approach, but working with sim- 
plex instead of Cartesian coordinates [31]. 

The work of Hiptmair should also be mentioned; he has also recently published 
a general scheme for the construction of higher-order elements, but from a far more 
mathematical viewpoint, and couched in the language of differential forms [32]. 

An important topic which we have not discussed is curvilinear elements. Whilst 
higher-order elements can do an excellent job of representing the fields very
accurately, the limitations imposed by straight-sided triangular or tetrahedral
elements in terms of accurate modelling of curved geometries can be very
significant for many practical problems. There are in essence two questions to
answer
here: firstly, given a geometrical transformation, how does one implement this as a 
curvilinear element, and secondly, what transformation should be used. The former 
is the more theoretical issue, the latter a more practical one. Strangely, although 
curvilinear elements have been used in CEM, the literature on this is rather in- 
complete, in particular in the context of vector elements. The following references 
either deal with the issue in passing, touch on the issue, or summarize some aspects 
thereof [5, 33, 34, 35]. In the context of nodal elements, the discussion in [1, Chap- 
ter 7] is also useful. Recent work by Marais is amongst the more comprehensive 
treatments, although limited to two-dimensional problems [36]. 

Although an obvious application of the FEM, discontinuities in rectangular 
waveguides have not been as widely addressed in the literature as one might ex- 
pect. Ise et al. [37] used “brick” elements of “first” order (CT/LN) to analyze both 
a dielectric post and a concentric step discontinuity in a rectangular waveguide; Jin 
presented a detailed formulation in [14, Chapter 8], also using CT/LN elements; 
Webb’s review paper discussed a number of related issues [38]; and Pekel and Lee
addressed theoretical aspects of mesh refinement using an empty piece of
waveguide [39]. Scott addressed rotationally symmetric waveguides and obtained
very good results using special purpose higher-order elements [40]. The present
author studied LT/QN elements in [15], and then considered the use of both
mixed- and full-order elements in [4].

Ferrari has recently published a new formulation for the analysis of scattering 
discontinuities in waveguides, using an extended Huygens’ principle [41]. The 
scatterer is discretized using finite elements, and the waveguide Green functions 
are used on the boundary of the scatterer, so this is a type of FEM/MoM hybrid. 
Geschke et al. reported the first successful implementation of the formulation in 
[42]; the details and many additional examples may be found in [43]. 

Regarding the FEM/MoM hybrid formulation, Peterson et al. [20] proposed that
FEM/MoM hybrids be classified as either outward-looking or inward-looking. In 
the former case, the surface integral formulation represented by the MoM is used to 
augment the variational functional form of the vector wave equation, and this was 
the approach used in this chapter. It is also the most commonly encountered in the 
literature and in practice, since the effect is to increase the size of the FEM matrix 
somewhat. Furthermore, this outward-looking approach is also readily amenable 
to the introduction of approximate radiation boundary conditions, such as
absorbing boundary conditions, rather than the rigorous Green function approach
implicit in the MoM. Inward-looking formulations use the interior problem to
constrain equivalent sources on the bounding surface. In this case, a large FEM
matrix must first be solved before a smaller dense matrix can be constructed. An
example of the latter approach is the unimoment method, first suggested by Mei
in 1974.
More details on this topic may be found in [20, Chapter 3]. One problem with the 
outward-looking approach outlined here is that the matrix symmetry is generally 
destroyed. Botha and Jin have recently proposed a formulation which hybridizes 
the FEM and MoM on the formulation level (rather than on the matrix level) 
and which preserves the matrix symmetry [44]. It also offers some alternative
approaches, including one which uses both E and H as working variables,
permitting both fields to be computed to the same level of accuracy. The FEM/MoM
formulation given in this chapter can suffer from internal resonances; the
problem comes from the MoM treatment, and has already been mentioned in Section
6.9. The usual solution is to combine the EFIE and MFIE on the boundary. The
formulation of
Botha and Jin apparently also solves the problem, and computed results support the 
claim [44]. 

In terms of time domain formulations and applications, the paper by Gedney and 
Navsariwala [45] is one of the earlier in the field to discuss the FETD. It discusses 
an unconditionally stable formulation using the Newmark-β method. Although
brief, it discusses most of the important topics and provides a stability analysis. 
The formulation is similar to that presented recently in [14]. The paper by Dibben 
and Metaxas [28] is also one of the earlier publications, and also uses the Newmark 
method. Together, these two represent well some of the earlier work on FETD 
formulations and implementations within CEM. The review paper by Lee et al. [46]
appeared in a special issue of the IEEE Transactions on Antennas and Propagation
on numerical methods some years back, but presents a very good overview of the
state of the art then (which, it should be commented, does not appear to have
advanced enormously since, with the exception of boundary conditions). It
presents an elegant theoretical framework for the general class of FETD meth- 
ods, and is more general than the approach presented in [14], which focusses on 
the conventional curl-curl functional formulation. One very troublesome problem 
with FETD methods has been the development of efficient ABC-based boundary 
conditions; the paper by Jiao et al. [47] appears to have been the first to report a 
rigorous PML-type implementation for the FETD, although several workers, in- 
cluding the present author, have encountered problems with this implementation, 
in particular regarding stability. 

Error estimation and mesh adaptation has a rather small bibliography in the 
engineering electromagnetics literature. Earlier work on this was done by Meyer, 
in the context of scalar, two-dimensional electromagnetic scattering and radiation 
problems, and results may be found in [48] and [49]. His results remain one of the 
most complete investigations of that specific problem. Some of the earlier work on 
the three-dimensional vector problem was done by Pekel and Lee [39]. 

In a field as large as finite elements, it is inevitable that there will be some 
important topics which we have not discussed at all. One which has produced 
important and interesting results is the analysis of dispersion error in finite element 
meshes (this is also sometimes called pollution error). The work of Cangellaris 
and Lee is an important reference here [50]; an overview of more recent work may 
be found in [14]. Also, modelling microwave ovens for commercial electro-heat 
applications has been a significant radio-frequency application of the FEM, using 
both eigenvalue and driven problem analysis. Details may be found in the books 
by Metaxas [51] and Chan and Reader [52]. 

Finally, serious students of the FEM who would like to read the large applied 
mechanics and applied mathematics literature will find that much of it uses the 
language of functional analysis. A very readable introduction is the text by Reddy 
[53], not least since it focusses on FEM formulations, unlike many of the more 
general texts on functional analysis. 
Appendix A
The Whitney element


The Whitney form $\lambda_i \nabla\lambda_j - \lambda_j \nabla\lambda_i$ is so widely used in vector elements that it is worth
discussing in more detail. The development here is for two-dimensional elements, which
has the benefit of simplicity; however, the essential argument is the same for the
three-dimensional case.

Firstly, note the following very important property of the gradient of a simplex
coordinate: it is constant, and is directed perpendicular to an edge. As an example, for the
triangle shown in Fig. A.1, $\nabla\lambda_1$ is perpendicular to edge 1, opposite vertex (node) 1. The
formula is

\[
\nabla\lambda_i = \frac{\ell_i}{2A} \hat{n}_i \tag{A.1}
\]

with A the area of the triangle, $\ell_i$ the length of edge i, and $\hat{n}_i$ the unit normal on edge i
(directed into the triangle, towards vertex i).
We now investigate the properties of an approximation using the Whitney basis
functions

\[
\vec{E} \approx E_3(\lambda_1 \nabla\lambda_2 - \lambda_2 \nabla\lambda_1)
+ E_2(\lambda_1 \nabla\lambda_3 - \lambda_3 \nabla\lambda_1)
+ E_1(\lambda_2 \nabla\lambda_3 - \lambda_3 \nabla\lambda_2) \tag{A.2}
\]

where $E_1$, $E_2$ and $E_3$ are constants whose physical meaning will shortly become clear.
We consider edge 3; anywhere on edge 3, $\lambda_3 = 0$, and $\nabla\lambda_3$ is perpendicular to it.
Finding the tangential component of the field on edge 3, we obtain:


é3-E = E3(4iVA2 — A2VA1) + Eo -0+ £-0 


> 


= Etan| 


Lae: (A.3) 
where the second and third terms are zero due either to 43 = 0 on edge 3 or VA3 being 
perpendicular to this edge. 

Using the sine rule for triangles (that the ratios of edge lengths and sines of opposite 
angles are equal) and Eq. (A.1), and the geometrical meaning of the dot product, we 
find 

$$\begin{aligned}
E_{\mathrm{tan}}\big|_{\text{edge 3}} &= E_3\,\frac{1}{2A}\left(\lambda_1\ell_2\,\hat{e}_3\cdot\hat{n}_2 - \lambda_2\ell_1\,\hat{e}_3\cdot\hat{n}_1\right) \\
&= E_3\,\frac{1}{2A}\left(\lambda_1\ell_2\sin\theta_1 + \lambda_2\ell_1\sin\theta_2\right) \\
&= E_3\,\frac{1}{2A}\,\ell_2\sin\theta_1\,(\lambda_1 + \lambda_2) \\
&= E_3\,\frac{1}{2A}\,\ell_2\sin\theta_1
\end{aligned} \tag{A.4}$$

where the identity $\lambda_1 + \lambda_2 + \lambda_3 = 1$ has been used in the last line (noting that $\lambda_3 = 0$ on 
this edge), and $\theta_i$ is the included angle at vertex $i$. Clearly, this is constant along the edge; it is now clear 
that $E_3$ determines the tangential field on edge 3. Similar results follow for $E_1$ and edge 1, and $E_2$ 
and edge 2. 

Figure A1 $\nabla\lambda_1$ (figure label: edge 1). 

It can be simplified further by noting (from Fig. A1) that $\ell_2\sin\theta_1$ is just the height of 
the triangle above edge 3. Since the area of the triangle is 
$A = (1/2)\,\ell h = (1/2)\,\ell_3\ell_2\sin\theta_1$, it follows that 

$$E_{\mathrm{tan}}\big|_{\text{edge 3}} = \frac{E_3}{\ell_3} \tag{A.5}$$

This is a well-known result, derived independently here. The general form for edge $i$ is: 

$$E_{\mathrm{tan}}\big|_{\text{edge } i} = \frac{E_i}{\ell_i} \tag{A.6}$$

If the vector basis function includes the edge length, as some published versions have, 
then the result is 

$$E_{\mathrm{tan}}\big|_{\text{edge } i} = E_i \tag{A.7}$$
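
The tangential-field property derived in this appendix is easy to confirm numerically. The following is a minimal Python sketch, not one of the book's MATLAB scripts; the function names and the example triangle are illustrative assumptions. It evaluates the three Whitney functions of Eq. (A.2) at points along edge 3 (taken here to run from vertex 1 to vertex 2) and prints their tangential components, which should be close to $1/\ell_3$ for the function multiplying $E_3$ and zero for the other two.

```python
import math

def simplex_gradients(verts):
    """Gradients of the three simplex coordinates of a triangle (constant over the element)."""
    (x1, y1), (x2, y2), (x3, y3) = verts
    two_area = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)  # signed; positive if counter-clockwise
    return [((y2 - y3) / two_area, (x3 - x2) / two_area),
            ((y3 - y1) / two_area, (x1 - x3) / two_area),
            ((y1 - y2) / two_area, (x2 - x1) / two_area)]

def simplex_coords(p, verts, grads):
    """lambda_i(p), using lambda_i = 1 at vertex i and the constant gradient."""
    return [1.0 + g[0] * (p[0] - v[0]) + g[1] * (p[1] - v[1])
            for v, g in zip(verts, grads)]

def whitney(i, j, lam, grads):
    """lambda_i grad(lambda_j) - lambda_j grad(lambda_i), with 0-based vertex indices i, j."""
    return tuple(lam[i] * grads[j][c] - lam[j] * grads[i][c] for c in range(2))

def tangential_on_edge3(verts, s):
    """Tangential components of the three Whitney functions at a point on edge 3.

    s in [0, 1] is the fractional distance from vertex 1 to vertex 2. The returned
    list holds the values for the functions multiplying E_3, E_2 and E_1 in Eq. (A.2).
    """
    grads = simplex_gradients(verts)
    p1, p2 = verts[0], verts[1]
    length = math.hypot(p2[0] - p1[0], p2[1] - p1[1])
    e3 = ((p2[0] - p1[0]) / length, (p2[1] - p1[1]) / length)
    p = (p1[0] + s * (p2[0] - p1[0]), p1[1] + s * (p2[1] - p1[1]))
    lam = simplex_coords(p, verts, grads)
    pairs = [(0, 1), (0, 2), (1, 2)]  # vertex pairs of the E_3, E_2 and E_1 terms
    return [sum(w * e for w, e in zip(whitney(i, j, lam, grads), e3))
            for i, j in pairs]

verts = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]  # arbitrary example; edge 3 has length 4
for s in (0.1, 0.5, 0.9):
    print(s, tangential_on_edge3(verts, s))
```

For this example triangle the first value is 1/4 at every sample point, independent of position along the edge, in line with Eq. (A.6).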


Appendix B 
The Newmark-β time-stepping algorithm 


The Newmark-β algorithm is rather more challenging to derive than is generally indicated 
in the literature, and its derivation is worth outlining. Most references cite the original 
paper by Newmark [1], which perhaps surprisingly does not derive the recurrence 
relationship, Eq. (10.45), which is generally associated with the name. This recurrence 
relationship was first given in a much later and very important paper by Zienkiewicz [2] 
published almost twenty years after the original method appeared. It is worth outlining the 
formulation, since it underlies the time-stepping approach implemented and does not 
appear to be available anywhere apart from Zienkiewicz’s paper, which can be difficult to 
obtain. 

The method is only relevant to the following differential equation representing a 
general second-order system with damping: 


$$M\ddot{x} + C\dot{x} + Kx + f = 0 \tag{B.1}$$

It was derived for structural mechanics, where $x$ is the displacement,¹ and $\dot{x}$ and $\ddot{x}$ the 
velocity and acceleration respectively. It is also based on a Taylor series expansion. For 
discrete samples at $t = n\Delta t$ and $t = (n+1)\Delta t$, the Taylor series expansion of the first 
derivative is 

$$\dot{x}_{n+1} = \dot{x}_n + \ddot{x}_n\,\Delta t + \dddot{x}_n\,\frac{\Delta t^2}{2} + \cdots \tag{B.2}$$

Newmark proposed that for sufficiently smooth functions this can be evaluated as 

$$\dot{x}_{n+1} = \dot{x}_n + \ddot{x}^{*}\,\Delta t \tag{B.3}$$

where $\ddot{x}^{*}$ represents some value of $\ddot{x}$ (in structural dynamics, the acceleration) intermediate 
between $\ddot{x}_n$ and $\ddot{x}_{n+1}$. This is where the parameter $\gamma$ in Newmark's scheme is introduced: 

$$\dot{x}_{n+1} = \dot{x}_n + (1-\gamma)\,\ddot{x}_n\,\Delta t + \gamma\,\ddot{x}_{n+1}\,\Delta t \tag{B.4}$$

Clearly, this is a second-order accurate scheme (for sufficiently smooth functions). The 
Newmark-β scheme uses $\gamma = 1/2$, hence the approximation of the second differential 
places equal weight on the values at $n$ and $n+1$. The function itself (in structural 
dynamics, the displacement) is approximated in a similar fashion, although in this case 
retaining an additional term in the Taylor series: 

$$x_{n+1} = x_n + \dot{x}_n\,\Delta t + (1-2\beta)\,\ddot{x}_n\,\Delta t^2/2 + 2\beta\,\ddot{x}_{n+1}\,\Delta t^2/2 \tag{B.5}$$

Note that this is not the time integral of the approximate velocity, but rather the 
expansion of the displacement. 

¹ The extension to two and three dimensions is straightforward: $x$ is replaced by the vector $\mathbf{x}$. 

Most textbooks which discuss the technique indicate that by writing Eq. (B.1) at 
time-step $n+1$ 

$$M\ddot{x}_{n+1} + C\dot{x}_{n+1} + Kx_{n+1} + f_{n+1} = 0 \tag{B.6}$$

and by also using Eqs. (B.4) and (B.5), one obtains values for $x_{n+1}$, $\dot{x}_{n+1}$ and $\ddot{x}_{n+1}$ in 
terms of $x_n$, $\dot{x}_n$ and $\ddot{x}_n$, and this is what Newmark implied in his original paper. This, 
however, is not the desired recurrence relation, Eq. (10.45). Zienkiewicz indicates the 
process required to obtain this. One writes the governing equation, Eq. (B.1), additionally 
at the time-steps $n$ and $n-1$; further, the integration formulas, Eqs. (B.4) and (B.5), are 
also written at time-steps $n-1$ and $n$. This provides seven equations in nine unknowns (three 
displacements, three velocities and three accelerations) from which all the velocities and 
accelerations can be eliminated to produce the conventional recurrence scheme: 

$$\begin{aligned}
&\left[M + \gamma\,\Delta t\,C + \beta\,\Delta t^2 K\right]x_{n+1} \\
&\quad + \left[-2M + (1-2\gamma)\,\Delta t\,C + \left(\tfrac{1}{2} + \gamma - 2\beta\right)\Delta t^2 K\right]x_n \\
&\quad + \left[M + (-1+\gamma)\,\Delta t\,C + \left(\tfrac{1}{2} - \gamma + \beta\right)\Delta t^2 K\right]x_{n-1} \\
&\quad + \left(\beta\,\Delta t^2\right) f_{n+1} + \left(\tfrac{1}{2} + \gamma - 2\beta\right) f_n\,\Delta t^2 + \left(\tfrac{1}{2} - \gamma + \beta\right) f_{n-1}\,\Delta t^2 = 0
\end{aligned} \tag{B.7}$$

The derivation as outlined above does not appear ever to have been published, only the 
results. 
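
The equivalence between the single-step form and the three-level recurrence can be checked numerically. The sketch below is a minimal Python illustration for a scalar system, not the book's implementation; the function names, the forcing term and the parameter values are assumptions chosen only for the test. The first routine advances displacement, velocity and acceleration using Eqs. (B.4) to (B.6); the second uses only displacements, with the coefficients $M + \gamma\Delta t\,C + \beta\Delta t^2 K$, $-2M + (1-2\gamma)\Delta t\,C + (1/2+\gamma-2\beta)\Delta t^2 K$ and $M + (-1+\gamma)\Delta t\,C + (1/2-\gamma+\beta)\Delta t^2 K$ of the conventional recurrence scheme.

```python
import math

def newmark_single_step(M, C, K, f, x0, v0, dt, nsteps, gamma=0.5, beta=0.25):
    """Textbook form: Eqs. (B.4)-(B.6) solved for the state at step n+1 (scalar system)."""
    a = -(C * v0 + K * x0 + f(0.0)) / M           # Eq. (B.1) at t = 0
    x, v, xs = x0, v0, [x0]
    for n in range(nsteps):
        # parts of (B.5) and (B.4) that do not involve the new acceleration
        xp = x + dt * v + (0.5 - beta) * dt ** 2 * a
        vp = v + (1.0 - gamma) * dt * a
        # substitute into (B.6) and solve for the acceleration at n+1
        a = -(C * vp + K * xp + f((n + 1) * dt)) / (M + gamma * dt * C + beta * dt ** 2 * K)
        x = xp + beta * dt ** 2 * a
        v = vp + gamma * dt * a
        xs.append(x)
    return xs

def newmark_recurrence(M, C, K, f, x_prev, x_curr, dt, nsteps, gamma=0.5, beta=0.25):
    """Three-level recurrence in displacements only; needs x_{n-1} and x_n to start."""
    c_next = M + gamma * dt * C + beta * dt ** 2 * K
    c_curr = -2.0 * M + (1.0 - 2.0 * gamma) * dt * C + (0.5 + gamma - 2.0 * beta) * dt ** 2 * K
    c_prev = M + (-1.0 + gamma) * dt * C + (0.5 - gamma + beta) * dt ** 2 * K
    xs = [x_prev, x_curr]
    for n in range(1, nsteps):
        load = (beta * f((n + 1) * dt)
                + (0.5 + gamma - 2.0 * beta) * f(n * dt)
                + (0.5 - gamma + beta) * f((n - 1) * dt)) * dt ** 2
        xs.append(-(c_curr * xs[-1] + c_prev * xs[-2] + load) / c_next)
    return xs

def forcing(t):
    """Example forcing term f(t) in M x'' + C x' + K x + f = 0 (arbitrary choice)."""
    return -math.cos(1.5 * t)

ref = newmark_single_step(1.0, 0.1, 4.0, forcing, 1.0, 0.0, 0.05, 200)
rec = newmark_recurrence(1.0, 0.1, 4.0, forcing, ref[0], ref[1], 0.05, 200)
print(max(abs(a - b) for a, b in zip(ref, rec)))  # difference at round-off level
```

Note that the recurrence needs two starting displacements; here the second is simply taken from the first step of the single-step routine.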

Importantly, Zienkiewicz then proposed that this recurrence relation can alternatively 
be derived by applying a weighted residual process to Eq. (B.1). In addition to providing 
an independent check of Eq. (B.7), this procedure permits a far more general approach to 
the problem, and proceeds as follows. Firstly, $x$ is approximated by the three-term 
expansion: 

$$x \approx \sum_i N_i x_i, \qquad i = n-1,\ n,\ n+1 \tag{B.8}$$

Obviously, this will support a second-order expansion in time, as required by the 
second-order derivative in Eq. (B.1). The shape functions $N_i$ (which represent the 
temporal expansion functions) are the usual node-based quadratic functions and are given 
in detail in [2, Eq. (10)]. It is further assumed that $x_n$ and $x_{n-1}$ are known, and that the 
only unknown is $x_{n+1}$. Hence only one weighting function $W$ is required. Replacing the 
interval $[-\Delta t, \Delta t]$ with the normalized variable $-1 \le \xi = t/\Delta t \le 1$, Zienkiewicz shows 
that if we identify 

$$\gamma = \frac{\int_{-1}^{1} W\left(\xi + \tfrac{1}{2}\right) d\xi}{\int_{-1}^{1} W\, d\xi}$$

$$\beta = \frac{1}{2}\,\frac{\int_{-1}^{1} W\,\xi(1+\xi)\, d\xi}{\int_{-1}^{1} W\, d\xi} = \frac{1}{2}\left(\gamma - \frac{1}{2}\right) + \frac{1}{2}\,\frac{\int_{-1}^{1} W\,\xi^2\, d\xi}{\int_{-1}^{1} W\, d\xi}$$


then we obtain Eq. (B.7). This is a very useful result, since it makes the approximations 
involved far clearer. It also permits us to extend the Newmark scheme if necessary. 
Zienkiewicz used the result to show how a variety of weighting functions yield different 
three “time-stations” time-stepping schemes, of which the Newmark scheme is the most 
general. For instance, with $\gamma = \tfrac{1}{2}$ and $\beta = 0$, the weighting function is a Dirac delta at 
time-station $n$, and the central difference scheme results. The Newmark-β scheme, on the limit of 
stability with $\gamma = \tfrac{1}{2}$ and $\beta = \tfrac{1}{4}$, corresponds to the “average acceleration” scheme and 
the weighting function is the linear function $|\xi|$, zero at the center of the interval (time-station $n$) 
and unity at the ends of the interval ($n-1$ and $n+1$) [2, Fig. 1]. It is also possible to 
produce higher-order schemes. Using cubic functions, for instance, a third-order scheme 
can be derived with four time-stations and Zienkiewicz also outlines this. 
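
The link between the weighting function and the Newmark parameters can be explored with a few lines of code. The sketch below is a hedged illustration only: the function name `newmark_parameters` and the use of a simple midpoint rule are assumptions, and it evaluates $\gamma = \int W(\xi + 1/2)\,d\xi / \int W\,d\xi$ and $\beta = \tfrac{1}{2}\int W\,\xi(1+\xi)\,d\xi / \int W\,d\xi$ over $-1 \le \xi \le 1$.

```python
def newmark_parameters(weight, n=2000):
    """Midpoint-rule estimate of (gamma, beta) for a weighting function W(xi) on [-1, 1]."""
    h = 2.0 / n
    w0 = w1 = w2 = 0.0
    for k in range(n):
        xi = -1.0 + (k + 0.5) * h
        w = weight(xi)
        w0 += w * h            # integral of W
        w1 += w * xi * h       # integral of W * xi
        w2 += w * xi * xi * h  # integral of W * xi**2
    gamma = 0.5 + w1 / w0
    beta = 0.5 * (w1 + w2) / w0
    return gamma, beta

# W = |xi|: the "average acceleration" case described in the text
print(newmark_parameters(abs))
# W = 1: a constant weighting function
print(newmark_parameters(lambda xi: 1.0))
# a narrow pulse centered on xi = 0, standing in for the Dirac delta
print(newmark_parameters(lambda xi: 1.0 if abs(xi) < 0.01 else 0.0))
```

The first case returns values close to $\gamma = 1/2$, $\beta = 1/4$, and the narrow pulse approaches $\gamma = 1/2$, $\beta = 0$, as stated above. The constant weighting function gives $\beta$ close to 1/6, the value usually associated with the linear acceleration method.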

Appendix C 
On the convergence of the MoM 


Throughout this book, checking convergence numerically has been continually 
emphasized. However, we have not discussed the more theoretical issues of whether the 
underlying numerical formulations are indeed convergent, in the sense that the 
approximate numerical solution $f_N$ of the continuous operator equation $Lf = g$ has the 
property $f_N \to f$ as $N \to \infty$. The aim of this appendix is to give a brief summary of the 
current status of this, which readers may be surprised to learn is far from a closed 
subject. 

With the FDTD, the Lax equivalence theorem (discussed in Chapter 2) provides us with 
confidence that refining the FDTD mesh will indeed result in a convergent solution. With 
the FEM, work in applied mechanics has provided a rich set of convergence results, 
although we should note that convergence for high-frequency electromagnetics problems 
is often in terms of the energy norm, as discussed in Chapter 10. This is a slightly weaker 
statement of convergence, since the energy norm does not satisfy all the properties of a 
norm. Also, these proofs are usually in terms of interpolation error; as has been noted, 
dispersion (or pollution) error is a different problem specific to the differential equation 
based solvers, but can usually be controlled by adequate meshing. (Integral equation 
formulations using exact Green functions do not suffer from this problem of cumulative 
error resulting from dispersion error [1, p. 200].) 

However, with the MoM, the problem has been studied somewhat less, presumably 
since the Green function is specific to electromagnetics. Rather surprisingly, only one 
form of operator has been rigorously shown to be convergent. (A recent summary may be 
found in [1, Chapter 5], which we summarize very briefly here.) This is the “identity plus 
compact” operator, of which the (two-dimensional) TE MFIE is an example. Proofs 
follow either via Galerkin’s method, or via degenerate kernel analysis. Other types of 
operators are “compact” (the TM EFIE) and “unbounded” (the TE EFIE) — for neither of 
these do rigorous convergence proofs currently exist. (Incidentally, this nomenclature 
derives from the behavior of the eigenvalues of the operators.) 

On the one hand, this is a somewhat disturbing situation, since important engineering 
designs are based on a field of mathematics which it transpires is far from complete. On 
the other hand, some forty years of development of the MoM has produced methods 
which have solved an enormous number of practical engineering problems with great 
accuracy, so it would appear most likely that what is missing is a convergence proof, 
rather than a fundamental problem. It would be satisfying were such proofs to be 
provided or, if they exist, popularized in the engineering literature. Here, we can but 
quote Peterson et al. [1, p. 224]: 

“Our previous experience with integral equation formulations supports the notion that, if constructed 
with sufficient care, numerical solutions appear to converge under much more general conditions. 
Despite this observation, the authors are not aware of more general convergence proofs applicable 
to the specific integral operators arising in electromagnetic scattering.” 


On the subject of convergence, another topic which has aroused controversy is whether 
the Galerkin formulation is superior to other forms of testing procedure. The controversy 
arose because the far-zone characteristics of the antenna or scatterer can be expressed as 
quadratic functionals of the surface current, which can sometimes be defined in such a 
way that they have a stationary point at the true solution. The work of Peterson et al. [1, 
Section 5.12] has shed new light on this matter: they have shown that provided the testing 
functions have similar accuracy properties as the basis functions, the overall error from 
either a true stationary functional (as can be obtained using a Galerkin procedure) or a 
general continuous functional form is of similar magnitude. They took this further, by 
numerical tests using high-order spline basis and testing functions; their results support 
the contention that the error is actually a function of the combined order of basis (P) and 
testing function (Q), and that a Galerkin solution with P = Q is no more accurate than a 
non-Galerkin solution with the same total P + Q. 

Appendix D 
Suggested exercises and assignments 


For graduate level courses, the following are suitable exercises. Most have been tested 
over the years by the author in a classroom environment. The approximate time required 
by a student to complete the assignment is also indicated, to assist in planning. This must 
be treated as only a guideline; it can change significantly, perhaps by as much as a factor 
of two either way, depending in particular on the programming ability of students or their 
familiarity with a particular code. The times given are for code development from scratch, 
and are based on the time the author and/or typical students have spent developing the 
routines or models; if some existing material is made available to the students, these can 
be greatly reduced. A number of MATLAB files, .pre files etc. are available to assist 
readers. 


Chapter 2 
1D FDTD analysis 


1. Write a program to implement the 1D FDTD analysis of a transmission line, as discussed in this 
chapter. In particular, repeat the results given for the single-frequency source (Fig. 2.6), and also 
for the wideband source (Figs. 2.20, 2.21 and 2.22). Also investigate the effects of other 
termination conditions, such as a matched load. [20 hours] 


Chapter 3 
2D FDTD analysis 


1. Repeat the $\mathrm{TE}_z$ scattering analysis discussed in this chapter using longer (in time) pulses and 
shorter pulses. Explain the time domain results obtained with each of these. Keep the grid size at 
800 × 400 and M = 1024 so that run-times remain minutes rather than hours. [20-30 hours] 

2. Modify the code to compute $\mathrm{TM}_z$ scattering from a cylinder. Does the $\mathrm{TM}_z$ polarization also 
show creeping waves? [10 hours] 

3. Finally, extend the code (either TM or TE) to use the PML ABC. Since one needs to verify the 
PML, this is quite time consuming. [20-30 hours] 


3D FDTD analysis 


1. A ring hybrid, or rat-race, is a four-port device which functions as a 180° hybrid. (These are 
discussed in some detail in Section 10.4.) Descriptions may be found in many books on 
microwave circuits, such as [1, Section 7.8]. Using a commercial package, predict the behavior 
of such a device fabricated in microstrip. [5-10 hours, depending on the ease of use of the 
package, and also the student’s familiarity with it] 


Partial solution 


The device was designed to operate at 1.8 GHz. One must first obtain the dimensions for 
the microstrip; this was based on the example in [1, p. 163]. The substrate was chosen as 
$d = 1.27$ mm thick, $\epsilon_r = 2.2$. For the 50 Ω feedlines, the strip width to substrate 
thickness ratio $W/d$ is 3.081, hence $W = 3.91$ mm. For the 70.7 Ω components in the 
rat-race, $W/d$ works out as 1.768, hence $W = 2.24$ mm. The effective dielectric constant 
in the 70.7 Ω section is 1.82 (it is slightly dependent on $W/d$). Hence, at 1.8 GHz, a 
quarter-wavelength in the dielectric is 30.8 mm. The average radius of the ring is thus 
29.4 mm. 
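
These design values can be reproduced approximately with a short calculation. The sketch below is a minimal Python illustration, assuming the standard closed-form microstrip design formulas found in texts such as [1]; the text does not state exactly which expressions were used, and the function names are hypothetical. It also assumes the usual rat-race geometry, in which the ring circumference is six quarter wavelengths.

```python
import math

C0 = 299792458.0  # speed of light in vacuum [m/s]

def microstrip_w_over_d(z0, eps_r):
    """Strip width to substrate thickness ratio for a required characteristic impedance."""
    a = (z0 / 60.0 * math.sqrt((eps_r + 1.0) / 2.0)
         + (eps_r - 1.0) / (eps_r + 1.0) * (0.23 + 0.11 / eps_r))
    ratio = 8.0 * math.exp(a) / (math.exp(2.0 * a) - 2.0)  # narrow-strip form, W/d < 2
    if ratio < 2.0:
        return ratio
    b = 377.0 * math.pi / (2.0 * z0 * math.sqrt(eps_r))    # wide-strip form, W/d > 2
    return 2.0 / math.pi * (b - 1.0 - math.log(2.0 * b - 1.0)
                            + (eps_r - 1.0) / (2.0 * eps_r)
                            * (math.log(b - 1.0) + 0.39 - 0.61 / eps_r))

def effective_permittivity(w_over_d, eps_r):
    """Quasi-static effective dielectric constant of the microstrip line."""
    return (eps_r + 1.0) / 2.0 + (eps_r - 1.0) / 2.0 / math.sqrt(1.0 + 12.0 / w_over_d)

def rat_race_dimensions(freq, d, eps_r, z0=50.0):
    """Strip widths, quarter wavelength and mean ring radius, all in metres."""
    wd_feed = microstrip_w_over_d(z0, eps_r)
    wd_ring = microstrip_w_over_d(math.sqrt(2.0) * z0, eps_r)
    eps_eff = effective_permittivity(wd_ring, eps_r)
    quarter = C0 / (freq * math.sqrt(eps_eff)) / 4.0
    radius = 6.0 * quarter / (2.0 * math.pi)  # circumference = six quarter wavelengths
    return {"w_feed": wd_feed * d, "w_ring": wd_ring * d,
            "eps_eff": eps_eff, "quarter": quarter, "radius": radius}

dims = rat_race_dimensions(1.8e9, 1.27e-3, 2.2)
for name, value in dims.items():
    print(name, value)
```

With these formulas the 50 Ω ratio comes out at about 3.08, which is the value implied by W = 3.91 mm and d = 1.27 mm, and the 70.7 Ω ratio at about 1.77. The quarter wavelength and ring radius agree with the values quoted above to within roughly 0.1 mm; the small differences presumably arise from rounding of intermediate quantities.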

The four ports were modelled as discrete ports on the ends of sections of 50 Ω feedline 
approximately 15 mm long. The standard planar coupler template was used, 
and the space on top is five times the substrate thickness, as recommended by an MWS 
tutorial. In this case, however, open boundaries should be used (apart obviously from the 
ground plane). 

At the design frequency, the results show the expected good match at port 1, 3 dB 
coupling to ports 2 and 3, and some 45 dB of isolation with respect to port 4 
(Fig. D1). 


Figure D1 An MWS simulation of a rat-race hybrid coupler in microstrip. (Axes: scattering parameter [dB] versus frequency [GHz].) 




Chapter 4 
1D MoM analysis 


1. Using the theory presented in Section 4.3, develop a thin-wire MoM code for a Z-directed dipole. 
Use sinusoidal weighting functions and collocation, so that Eq. (4.33) is applicable. Use both the 
delta-gap and magnetic frill source models, and replicate Fig. 4.3. [10 hours] 


Chapter 5 
Application of FEKO and NEC-2 


This chapter consists largely of material which lends itself to assigning as tasks, as well as 
simple variants on the designs presented. Most of these will take 5-10 hours if the FEKO 
or NEC-2 models are developed from scratch. If time is pressing, a good alternative is to 
make available an existing model and ask students to modify it for a different 
geometry, frequency range etc. 


Chapter 6 
2D MoM analysis and hybrid methods 


The material in this chapter does not readily lend itself to tasks. 


Chapter 7 


Sommerfeld potentials 


1. Develop a code to replicate the results in Figs. 7.4 and 7.9, and then Figs. 7.10, 7.11 and 7.12. 
[40 hours] 

2. Using this, develop an MoM code for a thin printed dipole and repeat the results of Fig. 7.13. [10 
hours] 

3. As an advanced task, instead of using the quasi-static approximation of Eq. (7.82), evaluate this 
rigorously as well. [Estimate 20 hours] 


Chapter 8 


Practical application of the Sommerfeld potentials 


Again, this chapter consists largely of material which lends itself to assigning as tasks, as 
well as simple variants on the designs presented. Similar comments apply as for 
Chapter 5. 


Chapter 9 


2D finite elements 


1. Using the theory developed in this chapter, develop a code to compute the TM eigenmodes. 
(Note that in this case, the problem is formulated in terms of the H field, and one uses the 
natural boundary condition on the waveguide walls.) [30-40 hours] 


Chapter 10 


Advanced FEM topics 
As with Chapter 6, the material in this chapter does not readily lend itself to tasks. 


Reference 
[1] D. M. Pozar, Microwave Engineering. New York: Wiley, 2nd edn., 1998. 


Appendix E 


Useful formulas for simplex coordinates 


Basic properties 


On a triangle: 

$$\lambda_1 + \lambda_2 + \lambda_3 = 1 \tag{E.1}$$

On a tetrahedron: 

$$\lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 = 1 \tag{E.2}$$


Integration 


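Integration over a triangle: 

$$\iint_S \lambda_1^i\,\lambda_2^j\,\lambda_3^k\, dS = \frac{2!\, i!\, j!\, k!}{(2+i+j+k)!}\, A \tag{E.3}$$

$A$ is the area of the triangle. 


Integration over a tetrahedron: 

$$\iiint_V \lambda_1^i\,\lambda_2^j\,\lambda_3^k\,\lambda_4^l\, dV = \frac{3!\, i!\, j!\, k!\, l!}{(3+i+j+k+l)!}\, V \tag{E.4}$$

$V$ is the volume of the tetrahedron. 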
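
The integration formulas are simple to code and to spot-check. The following Python sketch is an illustration under stated assumptions: `simplex_integral` and `triangle_quadrature` are hypothetical helper names, and the brute-force check uses the unit right triangle (area 1/2) with simplex coordinates $(1-x-y,\ x,\ y)$. The first function evaluates the closed form $d!\,\prod_m p_m!\,/\,(d + \sum_m p_m)!$ times the area or volume, with $d = 2$ for a triangle and $d = 3$ for a tetrahedron.

```python
from math import factorial

def simplex_integral(powers, measure):
    """Closed-form integral of prod(lambda_m ** p_m) over a triangle or tetrahedron.

    powers  -- (i, j, k) for a triangle, (i, j, k, l) for a tetrahedron
    measure -- area of the triangle or volume of the tetrahedron
    """
    d = len(powers) - 1
    num = factorial(d)
    for p in powers:
        num *= factorial(p)
    return num / factorial(d + sum(powers)) * measure

def triangle_quadrature(i, j, k, n=200):
    """Brute-force midpoint check on the unit right triangle, lambda = (1-x-y, x, y)."""
    total, h = 0.0, 1.0 / n
    for a in range(n):
        u = (a + 0.5) * h
        for b in range(n):
            v = (b + 0.5) * h
            x, y = u, v * (1.0 - u)  # maps the unit square onto the triangle
            total += (1.0 - x - y) ** i * x ** j * y ** k * (1.0 - u) * h * h
    return total

area = 0.5
print(simplex_integral((1, 1, 0), area), triangle_quadrature(1, 1, 0))  # off-diagonal entry
print(simplex_integral((2, 0, 0), area), triangle_quadrature(2, 0, 0))  # diagonal entry
```

The two cases shown are the integrals of $\lambda_1\lambda_2$ and $\lambda_1^2$, which the closed form gives as $A/12$ and $A/6$ respectively; integrals of this type arise when assembling element matrices from node-based linear functions.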


Gradient 


Gradient on a triangle: 

$$\nabla\lambda_i = \frac{\ell_i}{2A}\,\hat{n}_i \tag{E.5}$$

with $A$ the area of the triangle, $\ell_i$ the length of edge $i$ and $\hat{n}_i$ the normal on edge $i$, 
pointing into the triangle. 


Gradient on a tetrahedron: 

$$\nabla\lambda_i = \frac{A_i}{3V}\,\hat{n}_i \tag{E.6}$$

with $V$ the volume of the tetrahedron and $A_i$ the area of face $\{j, k, l\}$, with normal $\hat{n}_i$ 
pointing into the tetrahedron. 
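
The tetrahedral gradient formula can be checked in the same spirit. The sketch below is a minimal Python illustration with hypothetical helper names; it builds $\nabla\lambda_i$ as $A_i/(3V)$ times the inward unit normal of the face opposite vertex $i$, and then verifies two properties that any valid set of simplex-coordinate gradients must satisfy: known values on the unit tetrahedron, and a zero sum over the four vertices (since the simplex coordinates sum to one).

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def tet_gradient(verts, i):
    """grad(lambda_i) = (A_i / 3V) * inward unit normal of the face opposite vertex i."""
    a, b, c = [v for k, v in enumerate(verts) if k != i]
    n = cross(sub(b, a), sub(c, a))        # area vector of the face, |n| = 2 * A_i
    if dot(n, sub(verts[i], a)) < 0.0:     # flip so that n points towards vertex i
        n = tuple(-x for x in n)
    area = 0.5 * math.sqrt(dot(n, n))
    volume = dot(n, sub(verts[i], a)) / 6.0
    unit = tuple(x / (2.0 * area) for x in n)
    return tuple(area / (3.0 * volume) * x for x in unit)

unit_tet = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
for i in range(4):
    print(i + 1, tet_gradient(unit_tet, i))
```

On the unit tetrahedron the simplex coordinates are $1-x-y-z$, $x$, $y$ and $z$, so the printed gradients can be compared by inspection.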


Appendix F 


Web resources 


These sites, which include a number of commercial companies, were correct as of 2004 — 
web sites do change from time to time. This list is far from exhaustive, but gives a flavor 
of the variety of CEM products on offer, as well as the international technology base in 
this regard. 


Ansoft Corporation A Pittsburgh, USA based company specializing in commercial FEM 
code suites. 
URL: http://www.ansoft.com/ 


Applied Computational Electromagnetics Society An organization supporting the 
development, validation, and distribution of numerical EM modelling codes. Presently 
hosted by the University of Mississippi. Contains a number of very useful CEM links, 
including links to the public domain code NEC-2. 
URL: http://aces.ee.olemiss.edu/ 


Computer Simulation Technology Based in Darmstadt, Germany, this company 
specializes in commercial Finite Integration Technique (largely FDTD) code suites, in 
particular MWS. 
URL: http://www.cst.de/ or http://www.cst-world.com/ 


COMSOL A Swedish company, their main product is FEMLAB, a multi-physics FEM 
solver. 
URL: http://www.comsol.se/ 


EMLIB This site, maintained at JPL, has been created for the free distribution of 
electromagnetics software and related information. This related information includes 
relevant conference information, a list of other EM sites, and a user-defined searchable 
directory of people working in the EM field. 
URL: http://emlib.jpl.nasa.gov/ 


EMSS (Electromagnetic Software and Systems) Originally based in Stellenbosch, South 
Africa, this company now also has a German branch and US offices. Their main product 
is FEKO. They also provide a free GUI for NEC-2, Wiregrid for Windows. 
URL: http://www.emss.co.za/ or http://www.feko.info/ 


MININEC website EM Scientific, Inc market a professional version of this code. 
URL: http://www.emsci.com/ 


NEC-2 homepage An unofficial homepage with a number of links, as well as much of the 
NEC-2 documentation. 
URL: http://www.nec2.org/ 


REMCOM A US company, offering XFDTD, an FDTD-based package. 
URL: http://www.remcom.com/ 


Poynting Software Another South African company, based in Johannesburg, offering 
SuperNEC. 
URL: http://www.supernec.com/ 


The Schneider-Schlager FDTD database An exhaustive bibliography of published work 
dealing primarily with applications of, or extensions to, the FDTD method. 
URL: http://www.fdtd.org/ 


Zeland Software Based in California, their best known product is probably IE3D, a planar 
and 3D MoM simulation package. It is widely used for microstrip structure simulation. 
URL: http://www.zeland.com/ 


ABC 
alternate formulations for FDTD, 115 
complementary operator, 115 
FDTD, 77 
FDTD 1D, 78 
FEM time domain, 386 
impact on 3D FDTD, 107 
Mur 1st and 2nd order, 79 
PML, see PML 
radiation vs absorbing BC, 77 
Absorbing Boundary Condition(s), see ABC 
accuracy of CEM techniques, 5 
effect of finite discretization, 18 
effect of finite machine precision, 19 
effect of finite problem size, 18 
effect of numerical approximation, 19 
active impedance, 275, 281 
Adaptive Integral Method, 23 
advective equation, 78 
analytical solutions 
on “exactness” thereof, 195 
asymptotics, 4 
importance of methods, 23 


barycentric coordinates, see simplex 
coordinates 

basis functions (MoM) 
entire domain, 261 
for EQS thin-wire problem, 121 
NEC, see NEC, basis functions 
piecewise linear, 136 
piecewise sinusoidal, 129 
various types, 121 

Boundary Element Method, 7 
relationship to MoM, 118, 142 

branch points and cuts, 249 


capacitive iris, 353 
CFIE, 186 
collocation, 123, 140, 141 
Combined Field Integral Equation, see CFIE 
commercial codes 
Ansys, 14 

Ensemble, 8 

FDTD, 11-12, 107 

FEKO, see FEKO 

FEM, 14 

FEMLAB, 14 

GEMACS, 8 

general points about using, 108 

HFSS, see HFSS 

IE3D, 8 

increasing use of, 25 

MoM, 8 

MWS, see MWS 

SuperNEC, 8, 147 

websites, 403-404 

XFDTD, 11, 107 
complex plane 

integration on, 248 
computational complexity, see operation count 
computational cost, see operation count 
computers 

performance, 24 
Courant limit, 11, 32 

2D, 73 

for FDTD BOR formulation, 115 

in 1D, 46 

limitations of, 46 

physical interpretation of, 46 

running close to, 65 

Von Neumann’s method, 46 


debugging 
coping with complexity, 267-268 
FDTD ABC’s, 86 
FDTD plane-wave source, 83 
FDTD update equations, 82 
deterministic problems 
FEM, 14, 345 
DFT, 58 
differential forms, 312 
differentiating vectors, 297 
dipole 
general modelling hints, 151 
Discrete Fourier Transform, see DFT 
dispersion, 6 
accurate FDTD modelling of material properties, 
115 
dispersive materials, 11 
dispersive systems, 11 
effect on cumulative phase error, 93 
example of numerical, 85 
in FDTD simulations, 60-66 
in FEM meshes, 386 
magic time step, 65 
dispersion equation 
derivation of, 63 
Dyadic Green function, see Green function, dyadic 


edge elements, see FEM, vector elements 
edge-based elements, see FEM, vector elements 
EFIE, 184, 225 
Fredholm equation of first kind, 185 
interior resonance, 225 
eigenanalysis 
FEM, 14 
MoM, 14 
eigenproblem 
solution using LAPACK, 324 
solution using MATLAB, 324 
Electric Field Integral Equation, see EFIE 
electromagnetics 
history of, 3 
expansion functions, see basis functions 


Fast Fourier Transform, see FFT 
fast methods 
adaptive integral method, 217 
general, 184, 226 
k-space, 226 
misconceptions about iterative methods, 226 
Fast Multipole Method, 23, 184, 227 
Multilevel Fast Multipole Algorithm, 223 
three-dimensional formulation, 222-225 
two-dimensional prototype, 219-222 
FDTD 
accuracy, 43 
Alternating Direction Implicit algorithm, 115 
application to human exposure assessment, 363 
avoiding half-steps, 41 
Body of Revolution formulation, 115 
commercial codes, see commercial codes, FDTD 
comparison with FEM and MoM, 6 
computational molecule, 37 
consistency of method, 43 
Courant limit, see Courant limit 
FDTD as special case of time domain FEM, 371 
half-space step, 36 
half-time step, 36 
history, 32 
in one dimension, 29-67 
in three dimensions, 106-107 
in two dimensions, 69-93 
late time instabilities, 47 
near field to far field transformation, 115 
overview, 9-13 
semi-implicit approximation, 40, 99 
spurious modes, absence of, 309 
stencil, 37 
strong and weak points, 12-13 
sub-cell models, see sub-cell models (FDTD) 
wideband sources, see wideband sources (FDTD) 
Yee algorithm, 32 
Yee algorithm, 2D, 71 
Yee algorithm, 3D, 106 

FEKO, 8 
adaptive frequency sampling, 164, 169 
application to antenna above reflector, 205 
application to dipole, 149 
application to helix antenna, 167 
application to log-p, 159 
application to microstrip patch, 273 
application to patch coupling, 273 
application to printed dipole, 266 
application to RCS of a dielectric sphere, 197 
application to RCS of PEC sphere, 190 
application to Wu-King loaded dipole, 175 
application to Yagi-Uda, 153 
conditional execution, 161 
convergence, 149 
different source models, 151 
FEKO Lite, 148 
ground plane, 168 
history, 148 
input file (.fek), 148 
iteration loops, 161 
label, 176 
loading, 176 
modelling spherical surface, 192 
planar substrate, 273 
PREFEKO file (.pre), 148 
radius vs. diameter, 159 
scaling, 158 
scripting language, 148 
source models, 151 
transmission line modelling, 161 
use of MoM/PO hybrid, 207 
use of RWG element, 187 
use of symmetry, 191, 206, 278 
use of volumetric currents, 197 
user-defined variables, 158 
wire to plate connection, 168 

FEM 
(dis)similarity with MoM, 292 
application to capacitive iris, 353-358 
application to Magic-T hybrid, 349-353 
application to waveguide discontinuities, 345-358 
book-keeping, 324 
boundary conditions, at material interfaces, 301 
boundary conditions, flux continuity, 302 
boundary conditions, practical handling, 299 
boundary conditions, specification of, 291, 305 
boundary conditions, summary of, 301 
commercial codes, see commercial codes, FEM 
comparison with FDTD and MoM, 6 
connection matrix, 295, 324 
Courant’s contribution, 289 


curvilinear elements, 384 

data structures, 298, 322 

edge numbering, 321, 330 

eigenanalysis, 318 

element connection, 295-299 

element shape, 293 

elements, 290 

error estimation and adaptive meshing, 378-383, 
386 

face numbering, 330 

FDTD as special case of time domain FEM, 371 

formulation in three dimensions, 328-331 

formulation in two dimensions, 317-328 

free potentials, 297 

functional for eigenvalue problem, 305 

high-frequency variational functional, 305 

history, 289-290 

Lagrange multipliers, 331 

mass matrix, 302 

matrix assembly, 298 

matrix entries, explicit formula for, 318-321 

meshing, 323 

Method of Weighted residuals formulation, 291 

minimum of functional, 297 

Newmark-β method, 369 

Newmark-β method, derivation of, 392-394 

Newmark-β method, unconditional stability of, 
369, 385 

node numbering, 321 

overview, 13-16 

post-processing, 325-326 

practical implementation in 3D, 332 

prescribed potentials, 297 

rationale for complete elements, 353 

rectangular elements, 293 

results of eigenanalysis, 326 

shape function, 293 

simplicial elements, 290, 303 

sparse solvers, see sparse solvers 

spurious modes, 306-309, 324, 331 

spurious modes, predicting number of, 324 

stiffness matrix, 295 

strong and weak points, 15 

strong form, 344 

time domain, 365-372, 385 

time domain ABC, 386 

time domain formulation, 367-368 

triangular elements, 293 

variational boundary value problem viewpoint, 
343-345 

variational functional for Poisson equation, 
299-301 

variational functional formulation, 293, 
299-301 

vector elements, see vector elements 

vector wave equation, kernel, 307 

vector wave equation, null-space, 307 

vector wave equation, solution of, 305 

waveguide formulation, 345-348 

waveguide formulation using Huygens’ principle, 
385 

waveguide formulation, extracting S-parameters, 
348 
weak form, 344 
Whitney element, 313-317, 328-331, 390-391 
FEM/MoM hybrid 
application to human exposure assessment, 
363-365 
applications, general, 362 
inward-looking, 385 
outward-looking, 365, 385 
theory, 358-359 
FFT, 58 
description of algorithm, 218 
fast methods, 184, 216-219 
MATLAB implementation, 58 
Finite Difference Time Domain, see FDTD 
finite differences, 30-32 
backward differencing, 30 
central differencing, 30 
explicit methods, 32 
forward differencing, 30 
implicit methods, 32 
overview, 30 
Finite Element Method, see FEM 
finite integration technique, 11, 107 
equivalence with FDTD, 107 
Fourier transform, 36 
and spectral domain analysis, 233 
estimating, 57 
Fredholm integral equation, see EFIE and 
MFIE 
frequency scaling, see operation count 
frequency selective surface, 20, 115 
full-wave, 4-6 
extending limits, 22 
functional analysis 
and FEM, 386 
function, 186 
functional, 186 
Hilbert and Sobolev spaces, 141 
inner product, 141 
linear operator notation, 139, 186 
operator, 186 
symmetric product, 141 


gain 

dB vs. actual value, 167 
Galerkin 

and FEM, 291 

and MoM, 141 
generalised network parameters, 123 
Generalized Multipole Technique, 16 
geometrical optics, 4 
Green function, 7, 120, 231 

dyadic, 232-233 

free-space, 185 

static spectral domain, for microstrip, 

233-237 

Group Special Mobile, see GSM 
GSM, 363 

base station, 363 
Hankel function 
evaluation in FORTRAN, 194 
evaluation in MATLAB, 194 
helix antenna 
axial mode, 167 
normal mode, 167 
HFSS, 14, 317 
application to Magic-T hybrid, 350 
using, 350 
High Performance Computing 
Amdahl’s law, 212 
efficiency (parallel processing), 211 
parallel processing, 184, 210-216, 226 
speed-up (parallel processing), 211 
transputer, 211 
homogeneous coordinates, see simplex coordinates 
hybrid 
approximate, 201 
exact, 201 
FEKO implementation of MoM/PO, 205 
general definition, 201 
MoM/PO, 184, 202, 226 
MoM/PO, mechanics of, 203-205 
Sommerfeld formulation, 201 
hybrid FEM/MoM, see FEM/MoM hybrid 


in place operation, 41 
incident field 
for thin-wire MoM, 130 
inner product, see functional analysis, inner product 
integral equation, 120 
forcing function, 120 
kernel, 120 


junction treatments 
NEC, 135 
piecewise linear basis functions, 136 
piecewise sinusoidal basis functions, 139 


LAPACK, 267 
Laplace equation 
FEM solution of, 291 
Lax Equivalency Theorem, 44 
linear operator, see functional analysis, linear 
operator 
log-periodic antenna, 159 


Magic-T hybrid, 349 
Magnetic Field Integral Equation, see MFIE 
MATLAB 
efficient FDTD programming, 80 
frequently made errors, 81 
problems with indices, 71 
matrix equation solution, see solution of linear 
equations 
matrix inversion, see solution of linear equations 
Maxwell, 3 
Maxwell’s equations, 1 
predictive power, 17 
memory requirements 
2D FDTD, 92 
3D FDTD, 107 
surface MoM, 200 
thin-wire MoM, 200 
volumetric MoM, 200 
impact of sparse storage schemes for FEM, 376 
MoM Sommerfeld, 286 
mesh refinement, 25 
meshing 
FDTD stairstep approximation, 77, 91 
Method of Lines, 16 
Method of Moments, see MoM 
Method of Weighted Residuals, 7, 139 
equivalence with MoM, 118 
Method of Weighted residuals 
for FEM, 291 
MFIE, 185, 225 
Fredholm equation of second kind, 185 
interior resonance, 225 
microstrip, 231 
transmission line, 231 
microstrip patch 
FEKO simulation of, 273 
history, 271 
materials, 271 
mutual coupling, 273 
MWS simulation of, 112 
overview, 271 
microwave dielectric heating, 14, 386 
Microwave Studio, see MWS 
Mie scattering, see scattering from PEC sphere 
Mixed Potential Integral Equation, see MPIE 
Mobile telephony, 363 
modelling process 
accuracy, 17 
formulation simplifications, 18
manufacturing deviations, 18 
mathematical model limitations, 17 
tolerances, 17 
MoM 
commercial codes, see commercial codes, MoM 
comparison of source models, 151 
comparison with FEM and FDTD, 6 
convergence, 395-396 
delta-gap source model, 130 
electrodynamic example, 126 
electrostatic example, 119 
history, 118 
history of name, 142 
hybrid with FEM, see FEM/MoM hybrid 
in one dimension, 118-144 
magnetic frill source models, 130 
overview, 7-9 
stratified media, see MPIE, for stratified media 
strong and weak points, 8-9 
surface modelling, see surface modelling (MoM) 
thin-wire codes, see thin-wire codes 
volume modelling, see volume modelling (MoM) 
Moore’s Law, 5, 33 
MPIE, 189, 233, 246 
for stratified media, 244-246
MoM formulation for printed dipole, 262-266
results for printed dipole, 265
Multi-physics, 15 

mutual coupling, see microstrip patch, mutual
coupling, 275

MWS, 11, 107 
advanced modelling features, 116 
application to microstrip patch antenna, 112 
application to rat race hybrid, 398 
application to waveguide “through”, 108 
application to waveguide filter, 110 
improving results using adaptive meshing, 111 
open boundary simulation, 114 
parametric modelling, 114 
Perfect Boundary Approximation, 112 


NEC, 8 
PL and PT cards, 164 
application to dipole, 149 
application to log-p, 159 
application to Yagi-Uda, 153 
basis functions, 134 
column spacing in input file, 147 
comma demarcated input file, 147 
control cards, 153 
geometry file (.nec), 146 
GUI, 147 
overview, 132 
radius vs. diameter, 159 
structural cards, 153 
tag, 176 
transmission line modelling, 161 
Wiregrid for Windows, 147, 157 
wiremesh ground plane, 175 
wires penetrating real ground, 269 
NEC2, see NEC 
NEC4, see NEC 
non-linear problems 
application of FDTD, 115 
Numerical Electromagnetics Code, see NEC 
Nyquist, 11 
effect on time step, 56 


Occam, 148 

operation count 
2D FDTD, 92 
3D FDTD, 107 
surface MoM, 200 
thin-wire MoM, 200 
volumetric MoM, 200 
FDTD, 12 
FEM, 15 
MoM, 9 
MoM Sommerfeld, 286 
prohibitive cost of large MoM problems, 208 
reducing FDTD, 42 


parallel processing, see High Performance 
Computing, parallel processing 

parametric modelling, 114 

partially filled cells, see sub-cell models (FDTD) 
Perfectly Matched Layer, see PML
periodic structures 
FDTD modelling of, 115 
phased array 
feeding of, 285 
phased arrays 
overview, 280 
scan blindness, 280-286 
physical optics, 4 
PML, 10, 33, 78 
corner regions, 99 
drawbacks, 104 
evaluation of, 103 
implementation issues, 101 
implementation of 2D split-field, 99 
polynomial grading, 102 
results, 103 
split field, 94, 97-99 
split field (in 2D), 95 
split field (in 3D), 95 
stretched coordinates, 95, 105 
summary of properties, 98 
uniaxial, 95, 105 
Pocklington 
historical background, 142 
integral equation, 118, 128, 184 
integral equation and NEC, 133 
point-matching, see collocation 
Poisson equation 
FEM solution of, 299 
potentials 
basics, 237-238 
Hertz, 238 
Lorenz gauge, 237 
principal value, 185 
printed antennas, see microstrip antenna 
printed dipole 
equivalence to wire dipole, 286 
MPIE solution of, 261 


quantum mechanics 
bra-cket notation, 140 

quasi-static, 4 
magnetoquasistatics, 12 


radiation condition, 8, 290 
absence of in FEM and FDTD, 14 


Rao-Wilton-Glisson element, see RWG element


rat-race hybrid, 398 
RCS, 190
bistatic, 193
monostatic, 193
of PEC sphere, 190
rectangular waveguide, FEM solution of, 318 
residual, 140 
residue
evaluation of, 255
Riemann sheets, 249 
RWG element, 186-189 
connection with edge-based finite element, 187,


SAR, 366, 367
ICNIRP guidelines for, 365
scan blindness, see phased arrays, scan blindness 
scattering
incident/scattered field decomposition, 69, 73, 126
overview of process, 69
source inclusion, 73
total field, 69
scattering from a dielectric sphere, 197 
scattering from PEC sphere, 189-196 
analytical (Mie) solution, 193
blue sky explanation of Lord Rayleigh, 189 
history of Mie solution, 193 
simplex coordinates 
one dimension, 303 
overview, 303 
properties of, 304 
three dimensions, 305 
two dimensions, 304 
useful formulae, 401-402 
singularities 
in EFIE and MFIE, 185 
in MPIE, 264 
slow wave, 280 
solution of linear equations, 123 
conjugate gradient algorithm, 209 
direct solvers, 208 
iterative solvers, 209, 375 
Sommerfeld potentials 
alternate treatments, 268 
computational efficiency of, 286 
definition of, 238-241 
derivation of single-layer microstrip, 242-244 
evaluation of, 247-260 
evaluation of tail, 269 
extension to aperture coupling, 268 
half-space problems, 269, 287 
history of, 233 
illustrative results, 259-260 
limitations of implementations, 287 
locating the pole, 258-259 
MoM solution using, 260-266 
multiple layers, 268 
numerical integration in spectral domain, 
249-258 
transmission matrix, 269 
wires penetrating interfaces, 269 
sparse matrices, 332 
sparse solvers, 372-378 
Compressed Column Storage, 375 
Compressed Row Storage, 374 
direct, 373 
iterative, 373 
profile-in skyline storage, 373 
results, 376-378 
Specific Absorption Rate, see SAR 
spectral domain, 231, 238-258 
transform, 233, 239 
spurious modes, see FEM, spurious modes
stability 
effect of load on FDTD, 47 
of FDTD method, 43 
stratified medium
definition, 231 
sub-cell models (FDTD) 
curved boundaries, 107 
MWS implementation, 112 
overview, 107 
thin cracks, 107 
thin sheets, 107 
thin wires, 107, 114 
Surface Equivalence Principle, 226 
Surface Equivalence Theorem, 196 
Love’s form, 196 
surface modelling (MoM) 
conducting structures, 184 
homogeneous material regions, 184, 196 
surface waves, 244, 246-247, 280 
condition for dominant TM only, 246 
position of poles, 247 
symmetric product, see functional analysis, 
symmetric product 


TE 
FDTD formulation for scattering, 70 
guided wave mode, 34, 318 
scattering, 69 
scattering from PEC cylinder, 86 
telegraphist’s equations, 34 
TEM 
guided wave modes, 33 
testing functions, see weighting functions 
testing points, 124 
thin-wire approximation 
electrodynamics, 128 
electroquasistatics, 120 
impact of, 125 
limitations on accuracy, 132, 150 
thin-wire codes 
MININEC, 149 
Wire (WIRE89), 149 
thin-wire modelling (MoM) 
source models, see source models 
arbitrarily orientated wires, 143 
TM
guided wave mode, 318, 399 
scattering, 69 
transmission line, 33 
Transmission Line Matrix method, 16 
Transverse Electric, see TE 
Transverse Magnetic, see TM 
triangle area 
signed, 295 


Uniform Theory of Diffraction, 5 


validation and verification, 19 
analytical solutions, 19 
approximate solutions, 19 
code comparisons, 20 
frequency selective surface example, 20 
measurements, 20 
of 1D FDTD problem, 44 
summary of for FEKO and NEC2, 182 


vector elements, 293, 309-317 
complete, 337-338 
contributions to, 310 
criticism of, 331 
CT/LN, 311, 337
hierarchal higher-order, definition of, 338-340 
hierarchal higher-order, impact on code, 342-343 
hierarchal higher-order, properties of, 340-342 
higher-order, 15, 337 
higher-order elements, alternate methods for 
constructing, 384 
interpolatory higher-order, 338, 384 
LT/LN, 339 
LT/QN, 337, 339, 352 
matching hierarchal elements to a field, 343 
mixed-order, 337-338 
QT/QN, 339 
volume modelling (MoM), 184 
application to human exposure assessment, 363 


waveguide discontinuities
FEM solution of, 345 
weighting functions, 140 
wide-band antennas 
compared to non-dispersive, 181 
definition of, 167 
wideband sources (FDTD), 50 
DC content of and FDTD simulations, 
51 
Gaussian Derivative pulse, 51, 84 
Gaussian pulse, 50 
polynomial pulse, 52 
Wiregrid for Windows, see NEC, Wiregrid for 
Windows 
Wu-King condition, 135 
Wu-King loaded dipole, 175 


Yagi-Uda antenna, 153 
Yee algorithm, see FDTD, Yee algorithm 


